Abstract

This  research  outlines  the  development  and  simulation  of  a  signal  processing  approach 
to  real  time  wavefront  curvature  sensing  in  adaptive  optics.  The  signal  processing  approach 
combines vectorized Charge Coupled Device (CCD) read out with a wavefront modal estimation
technique. The wavefront sensing algorithm analyzes vector projections of image
intensity  data  to  provide  an  estimate  of  the  wavefront  phase  as  a  combination  of  several  low 
order  Zernike  polynomial  modes.  This  wavefront  sensor  design  expands  on  an  existing  idea 
for  vector  based  tilt  sensing  by  providing  the  ability  to  compensate  for  additional  modes. 
Under  the  proposed  wavefront  sensing  approach,  the  physical  wavefront  sensor  would  be 
replaced  by  a  pair  of  imaging  devices  capable  of  generating  vector  projections  of  the  image 
data.  Using  image  projections  versus  two-dimensional  image  data  allows  for  faster  CCD 
read  out  and  decreased  read  noise. 

The  primary  research  contribution  is  to  create  an  effective  method  for  estimating 
low  order  wavefront  modes  from  image  vector  information.  This  dissertation  provides 
simulation results and Cramér-Rao performance bounds for two wavefront sensor designs.
The first sensor provides estimates of tilt and defocus: Zernike polynomials 2 through 4.
The second sensor estimates Zernike polynomials 2 through 10. Sensors are simulated in
guide star applications under the influence of von Karman atmospheric phase aberrations
and CCD noise models. Secondary research contributions include identifying key algorithm
performance parameters and their sensitivities, and investigating strategies
for improving extensible phase screen generation.

A simulated performance comparison is conducted among the $Z_{2-4}$ sensor, the $Z_{2-10}$
sensor, a centroiding tilt sensor, and a projection based maximum likelihood tilt sensor.
Simulation trials using a subaperture diameter of 0.07 m stepped through values of $r_0$ from
0.04 to 0.14 m and average photon counts of 100 to 1000. The $Z_{2-4}$ sensor provides superior
performance over both tilt sensors in all trials conducted. The $Z_{2-10}$ sensor outperforms
both tilt sensors when the average photon count is greater than 200 photons, and performs
on par with both tilt sensors when the average photon count is between 100 and 200
photons.
9.14 Comparison of simulated projection based ML tilt estimator performance [...]

9.15 Comparison of simulated projection based ML tilt estimator performance to the $Z_{2-10}$ estimator over a range of $r_0$ and $K$ values

9.16 Solid lines indicate $Z_{2-10}$ residual MSE versus $r_0$ estimate. Dashed lines represent the $Z_{2,3}$ ML estimator performance threshold. The true value of $r_0$ is indicated by an x (solid line) or circle (dashed line). Triangles indicate $\pm 1\sigma$. $K = 1000$.

9.17 Solid lines indicate $Z_{2-10}$ residual MSE versus $r_0$ estimate. Dashed lines represent the $Z_{2,3}$ ML estimator performance threshold. The true value of $r_0$ is indicated by an x (solid line) or circle (dashed line). Triangles indicate $\pm 1\sigma$. $K = 200$.

9.18 Solid lines indicate $Z_{2-10}$ residual MSE versus $L_0$ estimate. Dashed lines represent the $Z_{2,3}$ ML estimator performance threshold. The true value of $L_0$ is indicated by an x (solid line) or circle (dashed line). Triangles indicate $\pm 1\sigma$. $K = 1000$.

9.19 Solid lines indicate $Z_{2-10}$ residual MSE versus $L_0$ estimate. Dashed lines represent the $Z_{2,3}$ ML estimator performance threshold. The true value of $L_0$ is indicated by an x (solid line) or circle (dashed line). Triangles indicate $\pm 1\sigma$. $K = 200$.

9.20 Solid lines indicate $Z_{2-10}$ residual MSE versus $l_0$ estimate. Dashed lines represent the $Z_{2,3}$ ML estimator performance threshold. The true value of $l_0$ is indicated by an x (solid line) or circle (dashed line). Triangles indicate $\pm 1\sigma$. $K = 1000$.

9.21 Solid lines indicate $Z_{2-10}$ residual MSE versus $l_0$ estimate. Dashed lines represent the $Z_{2,3}$ ML estimator performance threshold. The true value of $l_0$ is indicated by an x (solid line) or circle (dashed line). Triangles indicate $\pm 1\sigma$. $K = 200$.

WAVEFRONT CURVATURE SENSING FROM IMAGE PROJECTIONS


1.  Introduction 

This work includes the derivation and simulated performance of a fast, efficient
algorithm for real time wavefront curvature sensing. Real time wavefront sensing falls into
two categories: interferometric measurement of phase or phase slope, and estimation of the
phase from image intensity characteristics. The proposed wavefront sensing method falls
into the latter category. Phase estimators may be further distinguished by the number
and the order of the aberrations, or modes, they estimate. The category "tilt sensors,"
for instance, is often used in reference to linear mode estimators. Estimators that provide
phase information beyond the linear tilt modes may be referred to as "curvature sensors."
Linear modes consist primarily of low frequency phase characteristics. Linear mode
estimators are most accurate over small regions of the wavefront; consequently, tilt
estimators must use highly parallel systems with many small subapertures to provide a global
wavefront map. Curvature sensors estimate higher frequency modes and are generally
effective over larger regions of the wavefront. This research will outline a fast, effective tilt
sensing technique [1] and extend the technique to include higher order parameter
estimation. The following sections will define common methods of wavefront sensing and identify
the motivation behind wavefront sensing devices.

1.1  The  Random  Atmosphere 

The degree to which two point sources will be resolved by an imaging device in free
space is limited by diffraction effects directly tied to the size of the aperture, $D$, and
the wavelength, $\lambda$ [2]. This is because the width of a point source in the image plane is
essentially the width of the central spot of a circular diffraction pattern, commonly referred
to as a Rayleigh distance:

\[ \text{Rayleigh distance} = 1.22\,\frac{\lambda f}{D}, \qquad (1.1) \]

where $\lambda$ is the wavelength of the source and $f$ is the geometric focal length. The size of
a Rayleigh distance is inversely proportional to the aperture diameter, indicating that large
apertures will yield better resolving power. This effect is shown in Figure 1.1, which demonstrates
overlapping diffraction patterns from a circular aperture.

Figure 1.1 Simulated Airy disks for two different aperture diameters. The aperture used
for the images on the right side is double the diameter of the aperture used to
form the images on the left side.
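
As a rough numerical illustration of Equation (1.1), the following Python sketch evaluates the Rayleigh distance for two aperture diameters. The wavelength, focal length, and diameters are hypothetical values chosen only for illustration; they do not come from the simulations in this work.

```python
def rayleigh_distance(wavelength_m, focal_length_m, aperture_diameter_m):
    """Width of the central diffraction spot in the image plane, Eq. (1.1)."""
    return 1.22 * wavelength_m * focal_length_m / aperture_diameter_m

# Hypothetical values chosen only for illustration.
wavelength = 500e-9   # 500 nm visible light
focal_length = 1.0    # 1 m focal length

small = rayleigh_distance(wavelength, focal_length, 0.05)  # 5 cm aperture
large = rayleigh_distance(wavelength, focal_length, 0.10)  # 10 cm aperture

print(f"D = 0.05 m: {small * 1e6:.2f} um")
print(f"D = 0.10 m: {large * 1e6:.2f} um")
```

Doubling the aperture diameter halves the Rayleigh distance, which is the inverse proportionality described above.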

Optical  imaging  systems  designed  to  resolve  objects  through  the  earth’s  atmosphere 
must  contend  with  the  degrading  effects  of  its  continuously  fluctuating  index  of  refraction. 
This  condition  is  commonly  referred  to  as  atmospheric  turbulence.  Newton,  though  not 
convinced  of  the  wave  theory  of  light,  was  aware  of  diffraction  effects  and  the  benefits  of  a 
large  aperture  on  resolution.  He  was  also  aware  of  the  added  limitations  of  imaging  through 
atmospheric turbulence [3].

“Long  Telescopes  may  cause  Objects  to  appear  brighter  and  larger  than  the 
short  ones  can  do,  but  they  cannot  be  so  formed  as  to  take  away  the  confusion 
of  the  Rays  which  arises  from  the  Tremors  of  the  Atmosphere  [3].” 

When speaking of the "confusion of the Rays," Newton was describing the effects of
the atmosphere’s fluctuating index of refraction. Due to wind and temperature gradients,
the atmosphere churns and tumbles as it flows over the Earth. The turbulence contains
continuously evolving temperature and pressure variations. Since temperature and pressure
relate directly to the index of refraction, the index of refraction varies continuously as well
[5]. Imagine that a column of atmosphere is divided into many discrete segments each with
a different index of refraction. Snell’s law in ray optics predicts that the path of a single
ray will bend as it transitions through each of these segments. Hence, propagation through
atmospheric turbulence gives rise to random variations in the optical path.

Figure 1.2 Top: Ray propagating through a layered medium with multiple indexes of
refraction. Bottom: Ray propagating through a medium with a single index
of refraction.

Unlike in free space, a ray’s path through turbulence will not follow a straight line, but will wander
slightly as suggested by the top diagram in Figure 1.2. Combine this random wander in the
ray path with the diffraction pattern for a point source and the result is a spot image that
wanders around in the image plane as the atmosphere evolves. The imaging system will
apply some integrated exposure to these random spot movements and, consequently, the
image becomes a broadened diffraction pattern. The amount of broadening is related to the
turbulence in the optical path. For ground to space seeing conditions, these atmospheric
effects become the dominant limit on resolving power when the aperture size is larger
than a few centimeters. Technology continues to offer inventive ways to counter these
atmospheric effects. Today’s most powerful terrestrial telescopes "sense" the conditions of
the atmosphere and react to improve seeing conditions. The sensing capability relies on
the concept of an optical wavefront which contains a measure of the atmospheric effects.
This dissertation will review the concepts necessary for a basic understanding of these
atmospheric phenomena. The background will provide the foundation for developing an
improved method for wavefront sensing.
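
The layered-medium picture used above to motivate ray wander can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the layers are flat and parallel, and the refractive indices in `layers` are hypothetical values exaggerated so that the bending is visible; real atmospheric index fluctuations are far smaller.

```python
import math

def trace_ray(incidence_deg, indices):
    """Return the ray angle (degrees from the layer normal) inside each layer.

    Applies Snell's law, n1*sin(t1) = n2*sin(t2), at every planar interface.
    """
    angles = [incidence_deg]
    for n_prev, n_next in zip(indices, indices[1:]):
        s = n_prev * math.sin(math.radians(angles[-1])) / n_next
        angles.append(math.degrees(math.asin(s)))
    return angles

# Hypothetical refractive indices, exaggerated for illustration.
layers = [1.0000, 1.0003, 1.0001, 1.0004, 1.0002]
angles = trace_ray(30.0, layers)

for n, angle in zip(layers, angles):
    print(f"n = {n:.4f}  angle = {angle:.5f} deg")
```

Each change of index changes the ray direction slightly, so a ray crossing many randomly varying segments accumulates a small random deviation from its free space path.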
1.2 The Wavefront


Since this work is concerned with wavefront sensing, it is necessary to develop the
concept of a wavefront or phasefront. For the purpose of this work, the wavefront is defined
as the difference between some reference field, predicted by free space propagation, and the
actual field in the aperture of an imaging system. For a point source object, this reference
field has a simple geometric formulation. Consider optical energy emanating from a point
source. The wave propagation is equal in all directions. The sphere of radius $R$ with center
located at the point source represents a surface of constant amplitude and phase.

[Figure label: Spheres of constant amplitude and phase]

Figure 1.3 Graphical depiction of a spherical wavefront emanating from a point source.

Now, imagine that the point source is far away and $R$ becomes very large. If $R$ is very large then
the small portion of the spherical wavefront interfacing with the imaging system is planar
to close approximation. In many circumstances, light from a distant source is accurately
modeled as a plane wave (planar wavefront) over the optical system aperture. In some
instances it is more practical to discuss the effects of an optical system after attempting to
focus the planar wavefront. In these cases, it may be more appropriate to use a spherical
reference wavefront. For instance, consider the effects of an imperfect optical system on a
planar wavefront. The difference between the focused wavefront and a perfect spherical
wave reveals the imperfections in the system. This situation is demonstrated in Figure 1.4.

Figure 1.4 Diagram shows a reference planar wavefront passing through an imperfect
spherical lens. Aberrations in the lens system can be described by comparing
the outgoing aberrated wavefront to a reference spherical wavefront.

Whatever the form of the reference wavefront, wavefront sensors are designed to measure
the difference between the incoming wavefront and that reference. The turbulent atmosphere
acts as an aberrating thick lens, warping the wavefront as it propagates. Figure 1.5
demonstrates the effects of a large volume of atmosphere on a plane wave. A sensor
capable of measuring the amount of distortion created by the atmosphere would enable an
optical system to sense atmospheric effects and, given a reactionary capability,
compensate for these effects.
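
To give a feel for why a distant point source is well modeled by a plane wave, the sketch below computes how far a spherical wavefront of radius $R$ departs from a plane at the edge of an aperture. The aperture diameter, wavelength, and source distances are hypothetical values, and the function names are introduced here only for illustration.

```python
import math

def wavefront_sag(source_distance_m, aperture_diameter_m):
    """Departure of a spherical wavefront of radius R from a plane at the aperture edge."""
    R = source_distance_m
    h = aperture_diameter_m / 2.0
    # Algebraically equal to R - sqrt(R^2 - h^2), but avoids cancellation for large R.
    return h * h / (R + math.sqrt(R * R - h * h))

def sag_in_waves(source_distance_m, aperture_diameter_m, wavelength_m):
    """The same departure expressed as a number of wavelengths."""
    return wavefront_sag(source_distance_m, aperture_diameter_m) / wavelength_m

# Hypothetical 1 m aperture observed at 500 nm.
wavelength = 500e-9
for distance in (1e3, 1e6, 1e12):
    waves = sag_in_waves(distance, 1.0, wavelength)
    print(f"R = {distance:.0e} m: sag = {waves:.3e} waves")
```

The departure falls roughly as $D^2 / (8R)$, so for very distant sources it becomes a negligible fraction of a wavelength across the aperture.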

1.3  Adaptive  Optics 

An adaptive optics system employs a wavefront sensor in a feedback path. The
wavefront sensor provides an error measure to some system of actively controlled optics.
The controllable optics are then capable of compensating for the wavefront error. Adaptive
optics systems may be used to improve performance of imaging systems or laser propagation
systems. Although the purpose of the two types of systems is dramatically different, the
feedback and control mechanisms used to increase performance are remarkably similar.
The technology for such systems has been evolving since its conception in the 1950s [6]. These
systems are typically constructed using a telescope, an active or passive beacon, a wavefront
sensor, a deformable mirror and control electronics. Figure 1.6 shows a block diagram of an
adaptive optics system [3]. The beacon is used to provide the reference wavefront discussed
in the previous section. In celestial imaging, the beacon may be formed from a neighboring
bright star, called a natural guide star. When no such guide star exists, the object of
interest itself may be used. Dim objects and extended objects present further issues, and
in such cases an artificial guide star formed by laser reflection from the upper atmosphere
may be used.

Figure 1.5 Diagram demonstrates how the atmosphere will tilt and dimple an incoming
planar wavefront.

The wavefront sensor provides a measurement of the atmospheric distortion
at the input aperture in the form of a direct wavefront measurement or a wavefront slope
measurement. A brief discussion of various types of wavefront sensors follows in Section 1.4,
and a discussion in greater detail is included in Chapter 4. The deformable mirror consists
of a mechanically actuated device capable of forming the conjugate of the phase measured by the
wavefront sensor. The conjugate phase may be divided into a tilt component and higher
order components, in which case the system may include a gimbaled mirror designated for
global tilt correction and a deformable mirror used to conjugate higher order effects. The
control electronics are responsible for mapping the conjugate wavefront from the wavefront
sensor measurement to the actuator commands for a deformable mirror.


Figure 1.6 Block diagram of an Adaptive Optics system [3].

Two major challenges to be overcome when designing an adaptive optics system are
obtaining adequate levels of light for wavefront sensor performance and maintaining
the bandwidth necessary for active atmospheric compensation. Light from the beacon must
be routed to all necessary wavefront sensing devices. Ensuring that an adequate signal to noise
ratio is present in all optical detectors is critical to performance. If the beacon light shares
the same path with the object light, then conserving light for the imaging system becomes a
trade-off with providing light to the wavefront sensor. Real time correction for atmospheric
effects requires that the control electronics and deformable mirror make corrections at rates on the
order of several hundred Hz or greater. These bandwidths can be very demanding
specifications for the wavefront sensor and the control electronics. Wavefront sensors are diverse
in size, power and maintenance requirements. Choosing a wavefront sensor often drives the
design of the remainder of an adaptive optics system. Providing a new option for wavefront
sensing is the focus of this research.
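
The feedback arrangement described in this section can be illustrated with a deliberately simplified sketch. It assumes a single static, noise-free mode coefficient and a discrete integrator controller; the integrator, the gain value, and the function name are assumptions made here for illustration and are not taken from any system described in this work.

```python
def run_integrator_loop(aberration, gain=0.5, steps=20):
    """Closed-loop correction of a static aberration with a discrete integrator.

    aberration: true wavefront error (a single scalar mode coefficient).
    Returns the residual error seen by the sensor at each step.
    """
    correction = 0.0
    residuals = []
    for _ in range(steps):
        residual = aberration - correction   # what the wavefront sensor measures
        residuals.append(residual)
        correction += gain * residual        # controller updates the mirror command
    return residuals

residuals = run_integrator_loop(1.0)
print([round(r, 4) for r in residuals[:5]])
```

Because each iteration removes a fixed fraction of the remaining error, the number of sensor read outs available per unit time directly limits how quickly the loop can follow an evolving atmosphere, which is why sensor bandwidth matters.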


1.4 Wavefront Sensors

In order to describe the wavefront sensor in further detail, it is beneficial to first
transform the figurative concept of a wavefront into a tractable mathematical model. Consider
that the wavefront sensor must somehow estimate the complex electromagnetic field at the
optical system entrance pupil. The generalized pupil function, denoted $\mathcal{P}$, provides a basic
mathematical model for the optical field at the system pupil:

\[ \mathcal{P}(x,y,R_P) = A_P(x,y)\,W_P(x,y;R_P)\,\exp\bigl(jP_\phi(x,y)\bigr), \qquad (1.2) \]

where

\[ A_P(x,y) = \text{pupil amplitude function}, \qquad (1.3) \]

\[ W_P(x,y;R_P) = \text{pupil windowing function}, \qquad (1.4) \]

and

\[ P_\phi(x,y) = \text{pupil phase function}. \qquad (1.5) \]

$A_P$ represents the amplitude of the field, $W_P$ is a unit amplitude windowing function used
to mask out a circular aperture with radius $R_P$, and $P_\phi$ represents the phase of the field.
The atmosphere affects both the amplitude and phase of the field as it propagates. Under
many conditions, however, the phase distortions create far more pronounced effects in the
resulting image. Furthermore, although amplitude effects may be present, the dynamics
of those effects often occur on a spatial scale greater than the size of a wavefront sensor
subaperture, especially for "weak" turbulence. $A_P$ is relatively constant for such cases.
This is a pleasant characteristic of nature since amplitude effects are far more difficult to
compensate. For these reasons, the wavefront sensor is designed to detect the difference,
$P_\phi$, between the wavefront phase and some reference phase function. Field amplitude
is typically ignored. Because the wavefront phase is the quantity of interest, the terms
wavefront and phasefront are often used interchangeably in the literature. The phase
function $P_\phi$, being an error measure, is also commonly referred to as the atmospheric
aberration function or simply the phase aberration function. Unless specified otherwise,
the term wavefront in this document refers to the phase function $P_\phi$, which is assumed to
represent the difference in wavefront phase from some desired reference phase.
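
A minimal numerical sketch of the generalized pupil function in Equation (1.2) is given below, assuming a square sampling grid, a constant amplitude, and a hypothetical tilt phase. The grid size, radius, function name, and phase are illustrative choices only and are not the discrete model developed later in this dissertation.

```python
import numpy as np

def generalized_pupil(n, pupil_radius, amplitude, phase_fn):
    """Sample P = A * W * exp(j * phase) on an n x n grid spanning [-1, 1]^2."""
    coords = np.linspace(-1.0, 1.0, n)
    x, y = np.meshgrid(coords, coords)
    window = (np.hypot(x, y) <= pupil_radius).astype(float)  # 1 inside the aperture, 0 outside
    phase = phase_fn(x, y)                                    # pupil phase in radians
    return amplitude * window * np.exp(1j * phase)

# Hypothetical example: constant amplitude and a small x-tilt phase.
pupil = generalized_pupil(64, 0.8, 1.0, lambda x, y: 2.0 * x)

print(pupil.shape, np.abs(pupil).max())
```

With a constant amplitude, the magnitude of the sampled field is just the window, and all of the aberration information is carried by the phase term, which is the quantity the wavefront sensor sets out to estimate.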


The wavefront sensor must estimate $P_\phi$ to some level of precision. At optical
frequencies, only intensity can be measured directly, not the field amplitude and phase. The
wavefront sensor must then map from intensity measurements to field measurements. Some
sensors measure $P_\phi$ through interferometry. To do so, a portion of the incoming light is
used to create a reference wavefront which is then interfered with the original wavefront.
Interference fringes in the intensity reveal relative phase differences between the reference
and the aberrated wavefront. The self-referencing Point Diffraction Interferometer (PDI)
is an example of this type of wavefront sensor. Figure 1.7 provides a block overview of the
PDI [6]. Some wavefront sensors measure the slope of $P_\phi$ by interfering the wavefront with
a spatially shifted version of itself. The most common example of this interferometer is the
Lateral Shearing Interferometer (LSI). Figure 1.8 provides a block overview of the LSI [3].

Figure 1.7 Diagram of the self-referencing Point Diffraction Interferometer.

[Figure labels: Grating; image]

Figure 1.8 Diagram of the Lateral Shear Interferometer.

Measuring the wavefront phase without using interference techniques can seem a bit
daunting. The concept of modal estimation offers a way to simplify the problem. Like
any other function, the wavefront phase has some frequency domain representation.
Transforming portions of the frequency content into the spatial domain produces a set of
two-dimensional functions. These functions could be referred to as basis functions or modes.

Common Name       Zernike Polynomial
Piston            $1$
x-tilt            $2r\cos\theta$
y-tilt            $2r\sin\theta$
defocus           $\sqrt{3}\,(2r^2 - 1)$
astigmatism-xy    $\sqrt{6}\,r^2\sin 2\theta$
astigmatism       $\sqrt{6}\,r^2\cos 2\theta$

Table 1.1 The first six Zernike polynomials


The wavefront phase function may then be approximated by a sum of the ordered basis
functions:

\[ P_\phi(x,y) \approx \sum_{i=1}^{N} a_i f_i(x,y). \qquad (1.6) \]

In the case of the modal estimator, the functions $f_i(x,y)$ are a two-dimensional polynomial
basis set, and the coefficients $a_i$ are weights applied to each polynomial. As $N \to \infty$, the
approximation becomes exact. How does this simplify the problem of phase estimation?
The modal estimator approximates the phase function as a combination of only a small
number of polynomials. In the case of the tilt estimator, only 2 polynomials are used.
Choosing the class of polynomials to use can be crucial. One such set of polynomials is
the set of Seidel polynomials. Seidel polynomials are mentioned here because they appear
quite often in the literature, where they are used to describe lens specifications
for fabrication. A more convenient set of polynomials for measuring the aberrations in an
optical system is the set of Zernike polynomials. The first six Zernike polynomials are listed in
Table 1.1 (in polar coordinates). Notice that the first Zernike is simply a phase delay applied
to the entire aperture. When comparing phase from multiple subapertures, relative piston
measurements can be very helpful, but obtaining them is truly an engineering challenge due to the precision
required. The wavefront sensors discussed in this work will provide phase information from
a single subaperture and therefore piston is neglected in the measured aberration function.
Define $\tilde{P}_\phi$ to be the piston removed phase. The piston removed phase can be approximated
as a sum of scaled Zernike polynomials beginning with x-tilt:

\[ \tilde{P}_\phi(r,\theta) \approx \sum_{i=2}^{N} a_i Z_i(r,\theta). \qquad (1.7) \]
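
The modal expansion can be illustrated with a short sketch that builds Zernike polynomials 2 through 6 from Table 1.1, synthesizes a piston removed phase from hypothetical coefficients, and recovers those coefficients by least squares. Note the assumption: this fits the phase samples directly, which a real sensor cannot do because only intensity is measurable; it is meant only to show what the weights represent, and it is not the estimator derived in this dissertation.

```python
import numpy as np

def zernike_basis(r, theta):
    """Zernike polynomials 2 through 6 from Table 1.1 (piston omitted)."""
    return np.stack([
        2.0 * r * np.cos(theta),                    # x-tilt
        2.0 * r * np.sin(theta),                    # y-tilt
        np.sqrt(3.0) * (2.0 * r**2 - 1.0),          # defocus
        np.sqrt(6.0) * r**2 * np.sin(2.0 * theta),  # astigmatism-xy
        np.sqrt(6.0) * r**2 * np.cos(2.0 * theta),  # astigmatism
    ], axis=-1)

def fit_coefficients(phase, r, theta):
    """Least-squares estimate of the modal weights from phase samples."""
    basis = zernike_basis(r, theta)
    coeffs, *_ = np.linalg.lstsq(basis, phase, rcond=None)
    return coeffs

# Sample points inside the unit disk.
coords = np.linspace(-1.0, 1.0, 41)
x, y = np.meshgrid(coords, coords)
inside = np.hypot(x, y) <= 1.0
r, theta = np.hypot(x, y)[inside], np.arctan2(y, x)[inside]

# Hypothetical coefficients, used to synthesize a phase and then recover it.
true_a = np.array([0.8, -0.5, 0.2, 0.05, -0.1])
phase = zernike_basis(r, theta) @ true_a
estimated_a = fit_coefficients(phase, r, theta)

print(np.round(estimated_a, 6))
```

Five numbers are enough to describe this phase over the whole aperture, which is the simplification that modal estimation offers.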
Due to the nature of the atmospheric induced phase aberration, the average Zernike
coefficients become successively smaller as the order of the Zernike increases [7]. In fact, the
average variance of the tilt coefficients will be nearly 20 times greater than that of the defocus and
astigmatism coefficients. Under the right conditions, wavefront sensors can compensate
for up to 86% of the piston removed wavefront phase error by correcting for x and y-tilt
only [3]. Figure 1.9 demonstrates how tilt coefficients can be derived from shifted intensity
patterns. The off-center location of an Airy pattern can be translated into wavefront tilt
by the simple equations:

\[ \theta_x = \tan^{-1}\frac{\Delta y}{f}, \qquad (1.8) \]

\[ \theta_y = \tan^{-1}\frac{\Delta x}{f}. \qquad (1.9) \]
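
The shift-to-tilt conversion can be sketched for a single axis as follows. The sketch assumes the spot shift is measured with an intensity-weighted centroid and uses a synthetic Gaussian spot in place of an Airy pattern; the pixel pitch, focal length, and function names are hypothetical. Which shift component pairs with which tilt angle follows Equations (1.8) and (1.9).

```python
import math
import numpy as np

def centroid_offset(image, pixel_pitch_m):
    """Intensity-weighted centroid (dx, dy) in meters, measured from the array center."""
    rows, cols = np.indices(image.shape)
    total = image.sum()
    cy = (rows * image).sum() / total - (image.shape[0] - 1) / 2.0
    cx = (cols * image).sum() / total - (image.shape[1] - 1) / 2.0
    return cx * pixel_pitch_m, cy * pixel_pitch_m

def tilt_angle(shift_m, focal_length_m):
    """Tilt angle in radians implied by a spot shift along one axis."""
    return math.atan(shift_m / focal_length_m)

# Synthetic spot on a 65 x 65 detector, displaced 3 pixels along the column axis.
rows, cols = np.indices((65, 65))
spot = np.exp(-((cols - 35.0) ** 2 + (rows - 32.0) ** 2) / (2.0 * 2.0 ** 2))

pitch = 10e-6          # hypothetical 10 um pixels
focal_length = 1.0     # hypothetical 1 m focal length
dx, dy = centroid_offset(spot, pitch)
angle = tilt_angle(dx, focal_length)

print(f"shift = {dx * 1e6:.3f} um, tilt = {angle * 1e6:.3f} urad")
```

For shifts that are small compared with the focal length the arctangent is nearly linear, so the tilt angle is essentially the shift divided by the focal length.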


Figure 1.9 Demonstration of how wavefront tilt can be estimated from the off-center shift
of an Airy pattern.

Many of these tilt sensors can be combined together to form a wavefront sensor. Within
the wavefront sensor, the primary aperture is divided into a grid of smaller subapertures
each contributing a local tilt measurement. The combination of multiple subaperture tilt
measurements compensates for the lack of relative piston information. Using a surface
fitting algorithm, the grid of tilt or slope samples is used to reconstruct the wavefront
phase. The resulting wavefront is an estimate of the actual wavefront from linear phase
measurements. This type of wavefront sensor is commonly referred to as a Hartmann type
wavefront sensor. A diagram of how local wavefront tilt estimates can be used to reconstruct
the wavefront is shown in Figure 1.10. Modifications to the Hartmann wavefront sensor and
more sophisticated versions of interferometric wavefront sensors are discussed in Chapter 4.

[Figure panels: Focal Plane; Local Tilt Estimates; Reconstructed Wavefront]

Figure 1.10 The Hartmann type wavefront sensor uses an array of subapertures each
contributing a local tilt measurement. The local tilt measurements are
extrapolated to reconstruct the wavefront [3].
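
The surface fitting step can be illustrated with a small sketch. It assumes one common choice, a least-squares fit to finite-difference slopes on a square grid; the text above does not specify which surface fitting algorithm is used, so the geometry, function name, and test surface here are hypothetical.

```python
import numpy as np

def reconstruct_from_slopes(slope_x, slope_y, spacing=1.0):
    """Least-squares surface fit to finite-difference slopes on an n x n grid.

    slope_x[i, j] is the slope between grid points (i, j) and (i, j+1);
    slope_y[i, j] is the slope between (i, j) and (i+1, j).
    Piston is unobservable from slopes, so the result is returned with zero mean.
    """
    n = slope_x.shape[0]
    idx = np.arange(n * n).reshape(n, n)
    rows, rhs = [], []
    for i in range(n):
        for j in range(n - 1):
            row = np.zeros(n * n)
            row[idx[i, j + 1]], row[idx[i, j]] = 1.0, -1.0
            rows.append(row)
            rhs.append(slope_x[i, j] * spacing)
    for i in range(n - 1):
        for j in range(n):
            row = np.zeros(n * n)
            row[idx[i + 1, j]], row[idx[i, j]] = 1.0, -1.0
            rows.append(row)
            rhs.append(slope_y[i, j] * spacing)
    phase, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    phase = phase.reshape(n, n)
    return phase - phase.mean()

# Hypothetical test surface containing tilt and a little curvature.
n = 6
yy, xx = np.mgrid[0:n, 0:n].astype(float)
true_phase = 0.3 * xx - 0.2 * yy + 0.05 * (xx - 2.5) ** 2
slope_x = np.diff(true_phase, axis=1)
slope_y = np.diff(true_phase, axis=0)
estimate = reconstruct_from_slopes(slope_x, slope_y)

print(np.abs(estimate - (true_phase - true_phase.mean())).max())
```

The surface is recovered only up to a constant offset, reflecting the fact that slope measurements carry no piston information.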


1.5  Research  Contributions 

The research contributions contained in this dissertation are motivated by the need
for higher order modal estimation in real time adaptive optics. The first contribution is
a wavefront curvature sensor that provides estimates of Zernike polynomials $Z_2$ through
$Z_4$. The $Z_{2-4}$ sensor estimates x-tilt, y-tilt and defocus from image projections. The
image projection reduces read out time and CCD read noise. Combining the time savings
associated with image projection read out and an innovative algorithm design, the $Z_{2-4}$
sensor operates in real time. The second contribution is a curvature sensor capable of
estimating Zernike polynomials $Z_2$ through $Z_{10}$. The $Z_{2-10}$ sensor uses additional image
projections in order to estimate curvature terms $Z_5$ through $Z_{10}$. The third contribution
involves performance bounding for both curvature sensors. The Cramér-Rao lower bound
for estimator variance is used to bound the performance of each sensor and to provide insight
into design variable selection. The lower bound on performance also serves to validate sensor
simulation. Each sensor is simulated using von Karman phase aberrations and CCD noise
modeling. The simulation provides a means to compare performance to existing wavefront
sensor designs. The simulation is also used to provide an analysis of sensor sensitivity to
errors in environment variable estimates. The last contribution is a unique implementation
within the atmospheric simulation. The von Karman phase screens are generated using a
log-polar sampled phase screen generator. Phase screen generators are commonly used in
atmospheric turbulence simulation. The log-polar phase screen generation technique offers
improved isotropy and increased accuracy over existing phase screen generation techniques.
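
The idea of an image projection can be sketched as follows. The first function simply sums a two-dimensional image along each axis. The second illustrates the read noise argument under an assumption made here for illustration: that the charge for each projection element is summed before it is read, so that a single read (and a single dose of read noise) occurs per element. The function names and the noise value are hypothetical.

```python
import numpy as np

def image_projections(image):
    """Collapse a 2-D image into its two vector projections (column sums and row sums)."""
    return image.sum(axis=0), image.sum(axis=1)

def projection_read_noise_variance(n_pixels, read_sigma, summed_before_read):
    """Read-noise variance of one projection element.

    Assumes independent read noise of standard deviation read_sigma per read.
    If charge is summed before the read, only one read occurs; otherwise the
    n_pixels individual reads each contribute their own noise.
    """
    reads = 1 if summed_before_read else n_pixels
    return reads * read_sigma ** 2

image = np.arange(12.0).reshape(3, 4)
column_sums, row_sums = image_projections(image)

print(column_sums, row_sums)
print(projection_read_noise_variance(32, 3.0, True),
      projection_read_noise_variance(32, 3.0, False))
```

For an N x N detector the two projections contain 2N values instead of N squared, which is the source of the read out time savings described above.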

1.6  Organization 

This  dissertation  is  divided  into  |10|  chapters  including  this  introduction.  This  chapter 
is  meant  to  provide  some  insight  into  adaptive  optics,  the  need  for  wavefront  sensing  and 
a  few  introductory  concepts  required  to  understand  the  major  design  challenges  involved. 
Chapter  [2]  discusses  several  background  concepts  necessary  to  understand  the  derivation 
of  the  tilt  and  curvature  estimators.  The  background  concepts  include  an  introduction 
to  parameter  estimation,  and  atmospheric  turbulence  modeling.  Fourier  optics  concepts 
such  as  the  optical  transfer  function  are  also  discussed.  Chapter  [3]  introduces  a  discrete 
model  for  the  optical  system  and  the  detected  image.  The  noise  model  for  the  sensor 
detector  is  detailed  as  a  random  process  which  leads  to  a  probabilistic  mapping  from  image 
intensity  to  some  set  of  wavefront  modes.  Chapter  [3]  concludes  with  a  description  of  the 
image  projection  operator  notation  and  a  derivation  of  the  parameter  estimator  used  in 
each  sensor.  Chapter  [4]  serves  as  a  literature  review  of  related  research.  It  contains 
a  description  of  the  types  of  curvature  sensing  devices  currently  available.  The  literature 
review  concludes  with  an  outline  of  the  projection  based  tilt  estimator.  Chapter  [5] describes 
in  detail  the  extension  of  the  vector  based  tilt  estimator  required  in  order  to  estimate  the 
defocus  parameter.  Chapter  [6]  provides  a  method  for  bounding  the  performance  of  the 
projection  based  estimator.  Using  the  performance  bound  as  a  metric  for  determining 
ideal  design  configurations  is  demonstrated.  The  performance  bound  is  computed  for  both 
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the  Z2-A  and  the  Z2-10  sensors  under  a  typical  range  of  operating  parameters.  Chapter  [7] 
outlines  an  existing  method  for  generating  random  realizations  of  atmospheric  phase.  The 
phase  screen  generator  is  an  essential  part  of  the  wavefront  sensor  simulation.  The  polar 
sampled  phase  screen  generation  technique  is  described  in  detail.  Chapter  [7]  concludes 
with  a  performance  comparison  between  the  polar  phase  screen  generator  and  an  existing 
phase  screen  generation  technique.  Chapter  [8]  outlines  the  techniques  for  simulating  the 
projection  based  Z2-&  curvature  sensor  and  provides  a  summary  of  the  sensor  simulation 
results.  The  Z2-4  curvature  sensor  is  compared  to  its  lower  bound  and  a  simulation  of 
existing  tilt  sensor  designs.  A  sensitivity  analysis  is  also  performed  in  order  to  demonstrate 
the  robustness  of  the  sensor  to  erroneous  environmental  variable  estimates.  Chapter  [9] 
provides an overview of the Z2-10 sensor design and concludes with simulated performance,
a comparison to existing tilt sensor designs, and a sensitivity analysis.

2.  Background

The projection based wavefront curvature sensors presented in this dissertation are essentially
parameter estimators. In order to facilitate a better understanding of the wavefront
sensor designs, I will begin by providing background material in this chapter. This material
is essential to highlight the set of fundamental principles, any assumptions that I have applied,
and the mathematical motivation behind the design and simulation of the wavefront
curvature  sensors.  The  background  begins  with  a  review  of  the  Bayesian  estimator  and 
performance  bounds.  A  discussion  of  Kolmogorov’s  turbulence  model  and  its  application  to 
atmospheric  dynamics  follows.  From  there,  the  atmospheric  dynamics  are  parameterized. 
This  provides  a  set  of  atmospheric  phase  characteristics  to  be  estimated  along  with  their 
statistics:  the  crucial  link  between  the  random  nature  of  the  atmosphere  and  some  finite  set 
of  parameters.  Finally,  the  sensor’s  intensity  measurements  are  linked  to  the  field  phase 
characteristics  (the  parameter  set)  via  a  linear  optics  model.  The  concept  of  an  optical 
transfer  function  (OTF)  will  be  the  final  ingredient  that  offers  a  method  for  mapping  from 
the  sensor’s  observation  space  to  a  small  set  of  atmospheric  parameters. 


2.1  Parameter  Estimation 

The following parameter estimation background follows the treatment from Van Trees
[8]. Consider an experiment where some observed quantity, $\mathbf{R}$, is the outcome when the
environment is influenced by some parameter, $A$. Merely making an observation may not
reveal the exact parameter or set of parameters which led to the observed environment.
However, given an observation and some knowledge about the experiment, one may guess
at the parameters. Prior knowledge about the experiment typically consists of a probabilistic
mapping, $p_{r|a}(\mathbf{R}|A)$, from the parameter space to the observation space. Parameter
estimation replaces "guessing" with something more formal: a map from the observation
space, $\mathbf{R}$, to a parameter estimate, $\hat{A}$. The map from the observation to the
estimate is called an estimation rule, $\hat{a}(\mathbf{R})$. The diagram in Figure 2.1 describes the
estimation model. Note several variable naming conventions: lower case letters denote random
variables, upper case letters indicate instances of random variables or nonrandom quantities,
bold letters indicate vector quantities, and a caret indicates the estimate of a quantity.


Table 2.1 defines several likelihood expressions that will be used in this section.

Expression                                                      Description
--------------------------------------------------------------  ---------------------------------
$p_a(A)$                                                        probability density for A
$p_{r,a}(\mathbf{R}, A)$                                        joint density for A and R
$p_{r|a}(\mathbf{R}|A)$                                         conditional density for R given A
$p_{a|r}(A|\mathbf{R}) = \frac{p_{r|a}(\mathbf{R}|A)\,p_a(A)}{p_r(\mathbf{R})}$    Bayes' Rule, a useful identity
a posteriori density =
  (conditional likelihood x a priori density) / marginal density    defining the terms in Bayes' Rule

Table 2.1 Useful definitions from estimation theory.


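The estimation rule should result in a parameter estimate that minimizes risk, $\mathcal{R}$.
Risk is defined to be the expected value of a predefined cost function, $C$:

$$ \mathcal{R} = E\{C[a, \hat{a}(\mathbf{R})]\}, \tag{2.1} $$

$$ \mathcal{R} = \int_{-\infty}^{\infty} dA \int_{-\infty}^{\infty} d\mathbf{R}\, C[A, \hat{a}(\mathbf{R})]\, p_{a,r}(A, \mathbf{R}), \tag{2.2} $$

$$ \mathcal{R} = \int_{-\infty}^{\infty} d\mathbf{R}\, p_r(\mathbf{R}) \int_{-\infty}^{\infty} dA\, C[A, \hat{a}(\mathbf{R})]\, p_{a|r}(A|\mathbf{R}). \tag{2.3} $$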

Since  cost  is  subjective,  the  cost  function  selected  may  vary.  The  purpose  of  the  cost 
function  is  to  assign  some  penalty  to  error  in  the  estimate: 


$$ A_\epsilon = \hat{A} - A, \tag{2.4} $$

where $A$ is a realization of the random parameter, $a$, and $\hat{A}$ is the estimate of $A$.


A few common cost functions are shown in Figure 2.2. From left to right the example plots
demonstrate quadratic, linear, and uniform cost functions.

Figure 2.1 The estimation model [8].

Figure 2.2 Example cost functions.

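Assume a uniform cost function where the cost of error is unity outside some region,
$\Delta$, and the cost of error is zero within the region $\Delta$. Given uniform cost, the risk becomes:

$$ \mathcal{R}_{unf} = \int_{-\infty}^{\infty} d\mathbf{R}\, p_r(\mathbf{R}) \left[ \int_{-\infty}^{\hat{a}_{unf}(\mathbf{R}) - \frac{\Delta}{2}} dA\, p_{a|r}(A|\mathbf{R}) + \int_{\hat{a}_{unf}(\mathbf{R}) + \frac{\Delta}{2}}^{\infty} dA\, p_{a|r}(A|\mathbf{R}) \right], \tag{2.5} $$

$$ \mathcal{R}_{unf} = \int_{-\infty}^{\infty} d\mathbf{R}\, p_r(\mathbf{R}) \left[ 1 - \int_{\hat{a}_{unf}(\mathbf{R}) - \frac{\Delta}{2}}^{\hat{a}_{unf}(\mathbf{R}) + \frac{\Delta}{2}} dA\, p_{a|r}(A|\mathbf{R}) \right]. \tag{2.6} $$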


Minimizing risk, in this case, means choosing the estimation rule, $\hat{a}_{unf}(\mathbf{R}) = \hat{A}$, such that
the inner integral in (2.6) is maximized. Now consider the limiting case where the region $\Delta$
in the cost function approaches some arbitrarily small nonzero value. In the limit, the inner
integral is maximized when $\hat{a}_{unf}(\mathbf{R})$ equals the parameter that maximizes the a posteriori
density, $p_{a|r}(A|\mathbf{R})$:

$$ \min_{A = \hat{a}_{unf}} \left\{ \lim_{\Delta \to 0} \mathcal{R}_{unf} \right\}. \tag{2.7} $$

The notation, $\min_{A=a} f(A)$, is used as a compact form for the expression,

$$ a = A : f(A) = \min_{A} f(A), \tag{2.8} $$

which means that $a$ is set equal to the value of the input variable $A$ such that the function
$f$ is minimized over $A$. Substituting (2.6) into (2.7) and simplifying:


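$$ \int_{-\infty}^{\infty} d\mathbf{R}\, p_r(\mathbf{R}) \left[ 1 - \max_{A = \hat{a}_{unf}} \int_{\hat{a}_{unf}(\mathbf{R}) - \frac{\Delta}{2}}^{\hat{a}_{unf}(\mathbf{R}) + \frac{\Delta}{2}} dA\, p_{a|r}(A|\mathbf{R}) \right], \tag{2.9} $$

$$ \int_{-\infty}^{\infty} d\mathbf{R}\, p_r(\mathbf{R}) \left[ 1 - \max_{A = \hat{a}_{unf}} \left\{ p_{a|r}(A|\mathbf{R}) \right\} \int_{\hat{a}_{unf}(\mathbf{R}) - \frac{\Delta}{2}}^{\hat{a}_{unf}(\mathbf{R}) + \frac{\Delta}{2}} dA \right], \tag{2.10} $$

$$ \max_{A = \hat{a}_{unf} = \hat{a}_{map}} \left\{ p_{a|r}(A|\mathbf{R}) \right\}. \tag{2.11} $$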


This estimation rule, denoted $\hat{a}_{map}(\mathbf{R})$, is commonly known as the maximum a posteriori
(MAP) estimator. If the a posteriori density is continuous and has a first partial derivative
with respect to $A$, then the MAP estimator can be found by solving for the function maximum
in the usual manner. Furthermore, since the natural logarithm is a monotonically increasing
function, the maximum of the a posteriori density and the maximum of its natural logarithm
both occur at the same value of $A$. This is advantageous because applying Bayes' Rule (see
Table 2.1) and taking the natural logarithm allows for convenient simplification of the MAP
estimator expression. Begin by solving for the critical point (the maximum value in this case)
of a function in the typical manner. Take the first derivative and set the result equal to zero:


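$$ \max_{A = \hat{a}_{map}} \left\{ p_{a|r}(A|\mathbf{R}) \right\}, \tag{2.12} $$

$$ \left. \frac{\partial}{\partial A} p_{a|r}(A|\mathbf{R}) \right|_{A = \hat{a}_{map}} = 0. \tag{2.13} $$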


Since the value of $A$ at which the a posteriori density is maximized is also the point at
which the logarithm of the a posteriori density is maximized, substitute in the logarithm of
the a posteriori density:

$$ \left. \frac{\partial}{\partial A} \ln\left\{ p_{a|r}(A|\mathbf{R}) \right\} \right|_{A = \hat{a}_{map}} = 0. \tag{2.14} $$


Now  apply  Bayes’  Rule  and  evaluate  the  logarithm: 


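$$ \left. \frac{\partial}{\partial A} \ln\left\{ \frac{p_{r|a}(\mathbf{R}|A)\, p_a(A)}{p_r(\mathbf{R})} \right\} \right|_{A = \hat{a}_{map}} = 0, \tag{2.15} $$

$$ \left. \frac{\partial}{\partial A} \left[ \ln p_{r|a}(\mathbf{R}|A) + \ln p_a(A) - \ln p_r(\mathbf{R}) \right] \right|_{A = \hat{a}_{map}} = 0. \tag{2.16} $$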


Note  that  the  derivative  of  the  marginal  density  with  respect  to  the  parameter  is  zero. 
Removing  the  dependence  on  the  marginal  density  gives: 


$$ \left. \left[ \frac{\partial}{\partial A} \ln p_{r|a}(\mathbf{R}|A) + \frac{\partial}{\partial A} \ln p_a(A) \right] \right|_{A = \hat{a}_{map}} = 0. $$

Once again, taking the first derivative, setting the result equal to zero and solving for the
variable $A$ is equivalent to maximizing the sum of logarithms of the conditional and the a
priori densities. Rewriting the differential expression above as a maximization yields:

$$ \max_{A = \hat{a}_{map}} \left\{ \ln p_{r|a}(\mathbf{R}|A) + \ln p_a(A) \right\}. \tag{2.17} $$

From this result, it is easy to see that there are two probabilistic mappings that are required
to form the MAP estimator. The first term is the conditional probability of the observation
given some set of parameters, $p_{r|a}(\mathbf{R}|A)$, and the second term is the a priori probability
distribution of the parameter space, $p_a(A)$. Unfortunately, many cases arise where the
parameter a priori probability is unknown. In these cases, it is common to define some
range for the parameter and then assume a uniform probability distribution within the
range. If the a priori density is constant then its partial derivative with respect to the
parameter $A$ is zero and the expression for $\hat{a}$ becomes simpler still:

$$ \left. \frac{\partial}{\partial A} \ln p_{r|a}(\mathbf{R}|A) \right|_{A = \hat{a}_{ml}} = 0, \tag{2.18} $$

$$ \max_{A = \hat{a}_{ml}} \left\{ \ln p_{r|a}(\mathbf{R}|A) \right\}. \tag{2.19} $$

The estimator in this case is often referred to as the Maximum Likelihood (ML) estimator,
denoted $\hat{a}_{ml}$.
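
The limiting argument that turns the uniform cost estimator into the MAP estimator can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the a posteriori density is a hypothetical gamma-shaped density, p(A) = 0.5 A^2 exp(-A) for A >= 0 (mode at A = 2), and the function names `posterior_cdf` and `uniform_cost_estimate` are invented for this example. For each window width, a grid search picks the window center that captures the most posterior mass, which is the inner integral of (2.6).

```python
import math

def posterior_cdf(a):
    """CDF of the assumed posterior p(A) = 0.5 * A**2 * exp(-A), A >= 0."""
    if a <= 0.0:
        return 0.0
    return 1.0 - math.exp(-a) * (1.0 + a + 0.5 * a * a)

def uniform_cost_estimate(delta, step=0.001, upper=12.0):
    """Grid search for the window center that maximizes the posterior mass
    inside a window of width delta (the inner integral in (2.6))."""
    best_center, best_mass = 0.0, -1.0
    for i in range(int(upper / step) + 1):
        center = i * step
        mass = posterior_cdf(center + 0.5 * delta) - posterior_cdf(center - 0.5 * delta)
        if mass > best_mass:
            best_center, best_mass = center, mass
    return best_center

# As the window shrinks, the estimate moves toward the posterior mode (A = 2).
for delta in (4.0, 1.0, 0.01):
    print("window width", delta, "-> estimate", uniform_cost_estimate(delta))
```

For a wide window the estimate sits above the mode because the assumed density is skewed; as the window narrows the estimate settles on the mode, which is the MAP estimate.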

Now  suppose  that  it  is  necessary  to  measure  the  level  of  performance  of  the  estimator. 
A  common  method  for  evaluating  estimator  performance,  called  the  Monte  Carlo  method, 
involves  simulating  or  conducting  many  experiments  and  evaluating  the  variance  of  the 
estimator  over  a  large  sample  of  observations.  This  method  offers  an  estimate  of  the 
estimator  variance.  However,  the  estimated  variance  is  simply  a  number.  It  may  also 
be  useful  to  know  how  the  sample  variance  compares  to  a  theoretical  lower  bound.  The 
Cramer-Rao  lower  bound  (CRLB)  provides  a  benchmark  for  the  lowest  achievable  estimator 
mean  squared  error.  Van  Trees  provides  derivations  of  the  CRLB  for  both  the  single  and 
multiple parameter cases [8]. The CRLB on mean squared error for any unbiased estimator
is  presented  here  in  two  forms: 


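$$ E\left\{ (\hat{a}(\mathbf{R}) - a)^2 \right\} \ge \frac{-1}{E\left\{ \frac{\partial^2}{\partial A^2} \ln p_{r,a}(\mathbf{R}, A) \right\}}, \tag{2.20} $$

$$ E\left\{ (\hat{a}(\mathbf{R}) - a)^2 \right\} \ge \frac{1}{E\left\{ \left[ \frac{\partial}{\partial A} \ln p_{r,a}(\mathbf{R}, A) \right]^2 \right\}}, \tag{2.21} $$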


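where the expectation is taken over both $a$ and $\mathbf{r}$. The term unbiased indicates that the
mean or expected value of the estimator equals the true parameter: $E\{\hat{a}(\mathbf{R})\} = A$. If
the parameter is nonrandom or if the parameter is given an assumed uniform pdf, then the
CRLB simplifies:

$$ E\left\{ (\hat{a}(\mathbf{R}) - A)^2 \right\} \ge \frac{-1}{E\left\{ \frac{\partial^2}{\partial A^2} \ln p_{r|a}(\mathbf{R}|A) \right\}}, \tag{2.22} $$

$$ E\left\{ (\hat{a}(\mathbf{R}) - A)^2 \right\} \ge \frac{1}{E\left\{ \left[ \frac{\partial}{\partial A} \ln p_{r|a}(\mathbf{R}|A) \right]^2 \right\}}. \tag{2.23} $$

When the variance of an estimator is equal to the CRLB, then the estimator is efficient.

If an estimator is biased, then the CRLB above does not apply. The Cramer-Rao
inequality for biased estimators is sometimes referred to as the lower bound on mean squared
error:

$$ E\{\hat{a}(\mathbf{R})\} = A + B(A), \tag{2.24} $$

$$ E\left\{ (\hat{a}(\mathbf{R}) - a)^2 \right\} \ge \frac{\left[ 1 + \frac{dB(A)}{dA} \right]^2}{E\left\{ \left[ \frac{\partial}{\partial A} \ln p_{r,a}(\mathbf{R}, A) \right]^2 \right\}}. \tag{2.25} $$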


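A CRLB also exists for multiple parameter cases. Assuming $K$ parameters, the CRLB has the
following form (once again for unbiased estimates):

$$ E\left\{ (\hat{a}_i(\mathbf{R}) - a_i)^2 \right\} \ge J^{ii}, \tag{2.26} $$

where $J^{ii}$ is the $ii$th element in the $K \times K$ square matrix, $\mathbf{J}_T^{-1}$. $\mathbf{J}_T$ is defined as follows:

$$ \mathbf{J}_T = \mathbf{J}_D + \mathbf{J}_P, \tag{2.27} $$

$$ J_{D_{ij}} = -E\left\{ \frac{\partial^2 \ln p_{r|a}(\mathbf{R}|\mathbf{A})}{\partial A_i\, \partial A_j} \right\}, \tag{2.28} $$

$$ J_{P_{ij}} = -E\left\{ \frac{\partial^2 \ln p_a(\mathbf{A})}{\partial A_i\, \partial A_j} \right\}. \tag{2.29} $$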


Thus the MAP and ML estimators offer methods for minimizing the risk associated
with approximating parameters from experimental observations. The caveat is that something
must be known about the environment. In either case, the probabilistic map from the
parameter space to the observation space, $p_{r|a}(\mathbf{R}|A)$, must be known. The MAP estimator
requires an a priori probability for the estimated parameter(s) as well. This begins with
generating an accurate, yet tractable, model for the experiment. In the case of the wavefront
sensor problem presented here, it is necessary to develop models for the atmospheric
turbulence and detector noise. I will begin with the turbulence model.
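
A small numerical sketch may help connect the ML estimator, the MAP estimator, and the Monte Carlo comparison against the CRLB. It is not the wavefront sensor estimator; it assumes a deliberately simple hypothetical experiment in which a scalar parameter is observed N times in additive zero-mean Gaussian noise of standard deviation `sigma_n`, with an optional zero-mean Gaussian prior of standard deviation `sigma_a`. Under those assumptions the ML estimate is the sample mean, the MAP estimate is a shrunken sample mean, and (2.22) evaluates to sigma_n^2 / N. All function names are invented for the example.

```python
import random

def ml_estimate(obs):
    """ML estimate of a constant in additive Gaussian noise: the sample mean."""
    return sum(obs) / len(obs)

def map_estimate(obs, sigma_n, sigma_a):
    """MAP estimate with an assumed zero-mean Gaussian prior on the parameter.
    Setting the derivative in (2.16) to zero gives a shrunken sample mean."""
    n = len(obs)
    shrink = sigma_a ** 2 / (sigma_a ** 2 + sigma_n ** 2 / n)
    return shrink * ml_estimate(obs)

def crlb_nonrandom(sigma_n, n_obs):
    """Bound (2.22) for this model: -1 / E{d^2/dA^2 ln p(R|A)} = sigma_n^2 / N."""
    return sigma_n ** 2 / n_obs

def monte_carlo_ml_variance(true_a, sigma_n, n_obs, trials, rng):
    """Sample variance of the ML estimate over many simulated experiments."""
    estimates = []
    for _ in range(trials):
        obs = [true_a + rng.gauss(0.0, sigma_n) for _ in range(n_obs)]
        estimates.append(ml_estimate(obs))
    mean = sum(estimates) / trials
    return sum((e - mean) ** 2 for e in estimates) / (trials - 1)

rng = random.Random(0)
variance = monte_carlo_ml_variance(true_a=1.5, sigma_n=2.0, n_obs=10, trials=4000, rng=rng)
print("Monte Carlo variance of ML estimate:", variance)
print("CRLB:", crlb_nonrandom(2.0, 10))
```

In this toy model the sample mean is an efficient estimator, so the Monte Carlo variance should sit close to the bound; for the sensors discussed later the gap between the two is the quantity of interest.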


2.2  Turbulence  Modeling 

From  the  description  of  atmospheric  turbulence  provided  in  the  introduction,  the 
random  nature  of  the  index  of  refraction  leads  to  optical  system  performance  far  worse  than 
the  limits  imposed  by  diffraction  effects.  This  section  provides  a  review  of  the  most  common 
model  for  atmospheric  fluctuations  in  index  of  refraction  and  the  assumptions  inherent  in 
the model. Once the model for index of refraction is established, it is transformed into
a more useful phase model. The importance of a phase model as opposed to an index of
refraction model should be evident from the brief discussion in Section 1.2 concerning the
optical  wavefront.  Recall  that  the  wavefront  model  presents  the  aberration  function  as  a 
relative  phase  difference  between  the  wavefront  in  the  system  aperture  and  some  reference 
wavefront.  A  wavefront  sensor  must  detect  and  compensate  for  this  atmospheric  phase 
distortion.  In  order  to  design  such  a  device,  a  keen  understanding  of  the  nature  of  the 
phase  distortion  is  required  along  with  a  tractable  model  for  use  in  simulation  and  testing 
of  the  sensor  design.  The  most  popular  place  to  begin  deriving  such  an  atmospheric  model 
is  from  the  research  contributions  of  A.  N.  Kolmogorov. 

In the 1920s and 30s, Andrei Nikolaevich Kolmogorov made significant contributions
to mathematics in the area of probability theory and function spaces. These accomplishments
led to an applied mathematical treatment concerning the turbulent motion of fluids
[9]. Kolmogorov hypothesized a 2/3 power law for the mean square difference in velocity
between two points (often referred to as a structure function) in an isotropic, homogeneous
medium. The terms isotropic and homogeneous refer to the spatial statistics of the fluid.
Homogeneous means that the statistical moments are only a function of the displacement
vector between the two points of interest and not the location of either point. The term
isotropic further restricts the spatial statistics to depend only on the magnitude of the
displacement vector without regard for the displacement direction. Kolmogorov's velocity
structure function was of the form [5]:

$$ D_v(\mathbf{R}_1, \mathbf{R}) = E\left\{ [v(\mathbf{R}_1 + \mathbf{R}) - v(\mathbf{R}_1)]^2 \right\}, \tag{2.30} $$

$$ D_v(R) = C_v^2 R^{2/3}, \tag{2.31} $$

where $C_v^2$ is the velocity structure function constant. This structure function applies to a
region in the fluid called the inertial range. The inertial range is confined to a separation
of points less than the outer scale and greater than the inner scale. The outer scale is the
separation distance beyond which the turbulent motion is no longer considered isotropic.
For the purpose of this research, the atmosphere is the fluid of interest. Near the Earth's
surface, the atmospheric outer scale is generally considered equal to the height above the
ground. The outer scale at higher altitudes is often estimated in the tens of meters. The
inner scale is the separation distance where the turbulence gives way to molecular friction.
Reasonable values for atmospheric inner scale are on the order of a few millimeters to 15
cm.

Kolmogorov's power law provides a statistical model for relative particle velocity. It
is necessary to extend this statistic to the atmospheric index of refraction, $n$. The first
part of this extension lies in finding an expression for index of refraction that relates it to
particle velocity, for which Kolmogorov's statistic applies. The second critical step involves
a contribution by Corrsin concerning the concept of a conservative passive additive.
The index of refraction of air depends on density which is largely a function of temperature,
pressure and humidity. The approximate expression for index of refraction at optical
wavelengths, excluding humidity effects, is given by Andrews:

$$ n = 1 + 7.76 \times 10^{-7} \left( 1 + 7.52 \times 10^{-3} \lambda^{-2} \right) \frac{P}{T}, \tag{2.32} $$

$$ n \approx 1 + 7.9 \times 10^{-7} \frac{P}{T}. \tag{2.33} $$

The approximated $n$ includes an assumed wavelength in the optical band: $\lambda = 0.5 \times 10^{-6}$ m.
Now examine the differential:

$$ \delta n = 7.9 \times 10^{-7} \frac{P}{T} \left( \frac{\delta P}{P} - \frac{\delta T}{T} \right), \tag{2.34} $$

$$ \delta n \approx -7.9 \times 10^{-7} \frac{P}{T^2} \delta T. \tag{2.35} $$


This last approximation results from the fact that, at optical frequencies, temperature effects
dominate the fluctuations in $n$ and therefore pressure effects can be ignored [5]. Corrsin
explains that quantities can be categorized as conservative if they are not dependent on a
position in space. He further notes that a passive quantity bears the same atmospheric
statistics regardless of position. Given that conservative passive additives do not affect the
turbulence statistics, they obey the same 2/3 power law established for velocity fluctuations.
Temperature is not a conservative quantity in general, because it is dependent on altitude.
Consider potential temperature, however, or temperature about a specific altitude. Potential
temperature is a conservative quantity. Define potential temperature, $\theta_T$, as follows:

$$ \theta_T = T - 9.8\, \frac{^{\circ}\mathrm{C}}{\mathrm{km}}, \tag{2.36} $$

$$ \delta\theta_T = \delta T, \tag{2.37} $$

$$ \delta n \approx -7.9 \times 10^{-7} \frac{P}{T^2} \delta\theta_T. \tag{2.38} $$

Now it is obvious that index of refraction fluctuations bear a direct relationship to potential
temperature fluctuations making potential temperature a passive quantity. Since $\theta_T$ is
a conservative passive additive, its structure function obeys the same 2/3 power law as
velocity:

$$ D_{\theta_T}(R) = C_{\theta_T}^2 R^{2/3}. \tag{2.39} $$

It follows then that $n$ follows a 2/3 power law as well, thus the desired statistic is given:

$$ D_n(R) = C_n^2 R^{2/3}. \tag{2.40} $$
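
The expressions (2.32) through (2.35) are easy to check numerically. The sketch below is only an illustration: it assumes the wavelength entering (2.32) is expressed in micrometers, pressure in pascals, and temperature in kelvin (units that reproduce a sea-level refractivity near 2.7e-4 but are not stated in the text), and the function names are invented for the example.

```python
def refractivity_coefficient(wavelength_um):
    """Coefficient multiplying P/T in (2.32); wavelength assumed in micrometers."""
    return 7.76e-7 * (1.0 + 7.52e-3 / wavelength_um ** 2)

def refractive_index(pressure, temperature, wavelength_um=0.5):
    """Index of refraction from (2.32); pressure in Pa and temperature in K (assumed)."""
    return 1.0 + refractivity_coefficient(wavelength_um) * pressure / temperature

def index_fluctuation(pressure, temperature, delta_t):
    """Temperature-driven index fluctuation from (2.35)."""
    return -7.9e-7 * pressure / temperature ** 2 * delta_t

# Coefficient at 0.5 micrometers, to be compared with the 7.9e-7 used in (2.33).
print("coefficient:", refractivity_coefficient(0.5))

# A 0.1 K temperature change at roughly sea-level conditions.
pressure, temperature = 101325.0, 288.0
exact = refractive_index(pressure, temperature + 0.1) - refractive_index(pressure, temperature)
print("finite difference:", exact)
print("equation (2.35):  ", index_fluctuation(pressure, temperature, 0.1))
```

The two printed fluctuations should agree to about one percent, the residual coming from rounding the coefficient to 7.9e-7.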

It is necessary to transform this spatial statistic into a spectral representation. An
expression for the power spectral density is necessary in order to describe the process
spectrally. Transforming the structure function into a power spectrum is made possible via the
Fourier-Stieltjes integral [ill]:

$$ x(t) = \int_{-\infty}^{\infty} e^{j\omega t}\, dv(\omega), \tag{2.41} $$

$$ v(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-j\omega t} x(t)\, dt, \tag{2.42} $$

$$ x(t) = \text{spatial or temporal correlation}, \tag{2.43} $$

$$ dv(\omega) = \text{infinitesimal spectral band}. \tag{2.44} $$

Solving the integral will require working with the correlation, $B_n$, rather than the structure
function, $D_n$. Recall the following relationship between the structure function and the
correlation for a homogeneous random process:

$$ B_n(0) - B_n(R) = \frac{1}{2} D_n(R). \tag{2.45} $$

It will be helpful to extend the property of statistical homogeneity into the temporal domain
by assuming that the atmospheric statistics do not vary with time. When the temporal
moments of a random process do not vary with time, the process is considered stationary.
Carrying the spatial statistics over to a temporal property such as stationarity requires an
assumption of ergodicity. Assuming that the atmospheric statistics are ergodic is to assume
that taking many random samples of the atmosphere in different locations will yield the same
statistics as sampling the same location over many time instances. In other words, the
temporal statistics are the same as the spatial statistics, in a mean square sense. Finally, this
derivation will require a transform pair between the spatial correlation function and the
spectral density. To this purpose, $B_n(R)$ must be band limited in order to ensure that the
inverse transform exists.
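
The relationship (2.45) and the ergodic assumption can be illustrated with a toy stationary sequence. The sketch below does not model the atmosphere; it assumes a hypothetical first-order autoregressive process with unit variance, whose ensemble correlation at lag k is rho^k, and estimates both the correlation and the structure function from averages along a single realization. The function names are invented for the example.

```python
import random

def simulate_ar1(rho, n, rng):
    """Zero-mean, unit-variance AR(1) sequence (an assumed stand-in process)."""
    x = [rng.gauss(0.0, 1.0)]
    scale = (1.0 - rho * rho) ** 0.5
    for _ in range(n - 1):
        x.append(rho * x[-1] + scale * rng.gauss(0.0, 1.0))
    return x

def sample_correlation(x, lag):
    """Average of x[i] * x[i + lag] along the realization."""
    n = len(x) - lag
    return sum(x[i] * x[i + lag] for i in range(n)) / n

def sample_structure(x, lag):
    """Average of (x[i + lag] - x[i])**2 along the realization."""
    n = len(x) - lag
    return sum((x[i + lag] - x[i]) ** 2 for i in range(n)) / n

rng = random.Random(0)
x = simulate_ar1(rho=0.9, n=20000, rng=rng)
lag = 5
print("structure function D(lag):", sample_structure(x, lag))
print("2 [B(0) - B(lag)]:        ", 2.0 * (sample_correlation(x, 0) - sample_correlation(x, lag)))
print("ensemble value:           ", 2.0 * (1.0 - 0.9 ** lag))
```

The first two numbers agree almost exactly, which is (2.45) applied to sample averages; their agreement with the third, ensemble value is what the ergodic assumption provides.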
Substituting $B_n$ into the Fourier-Stieltjes integral transform and simplifying will require
some mathematical rigor. Both Andrews and Strohbehn [5] provide a more detailed
version of the derivation. The following summarizes the treatment from Strohbehn.

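Begin by defining a zero mean index of refraction random variable, $n_1$:

$$ n(\mathbf{R}) = E\{n(\mathbf{R})\} + n_1(\mathbf{R}), \tag{2.46} $$

$$ E\{n_1(\mathbf{R})\} = 0. \tag{2.47} $$

Apply the Fourier-Stieltjes integral to the zero mean random variable, $n_1$:

$$ n_1(\mathbf{R}) = \int_{-\infty}^{\infty} e^{j\mathbf{K} \cdot \mathbf{R}}\, dN(\mathbf{K}), \tag{2.48} $$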


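where $\mathbf{K} = (K_x, K_y, K_z)$ is the three-dimensional spatial wave number and $dN(\mathbf{K})$ is some
small spectral harmonic of the zero mean index of refraction. I am only interested in
the zero mean random process $n_1$, not $n$. However, for notational simplicity, I would like
to retain the variable $n$ and dispense with the subscripted variable $n_1$. For this reason,
the reader may assume that all subsequent references to "the index of refraction" and the
variable $n$ are indeed referring to the zero mean random process. Writing the correlation
for the index of refraction:

$$ E\{n(\mathbf{R}_1)\, n^*(\mathbf{R}_2)\} = B_n(\mathbf{R}_1, \mathbf{R}_2), \tag{2.49} $$

$$ B_n(\mathbf{R}_1, \mathbf{R}_2) = E\left\{ \int dN(\mathbf{K}_1)\, e^{j\mathbf{K}_1 \cdot \mathbf{R}_1} \times \left[ \int dN(\mathbf{K}_2)\, e^{j\mathbf{K}_2 \cdot \mathbf{R}_2} \right]^* \right\}, \tag{2.50} $$

$$ B_n(\mathbf{R}_1, \mathbf{R}_2) = E\left\{ \iint dN(\mathbf{K}_1)\, dN^*(\mathbf{K}_2)\, e^{j(\mathbf{K}_1 \cdot \mathbf{R}_1 - \mathbf{K}_2 \cdot \mathbf{R}_2)} \right\}, \tag{2.51} $$

$$ B_n(\mathbf{R}_1, \mathbf{R}_2) = \iint E\{dN(\mathbf{K}_1)\, dN^*(\mathbf{K}_2)\}\, e^{j(\mathbf{K}_1 \cdot \mathbf{R}_1 - \mathbf{K}_2 \cdot \mathbf{R}_2)}. \tag{2.52} $$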


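Making the substitution $\mathbf{R}_2 = \mathbf{R}_1 + \mathbf{R}$:

$$ B_n(\mathbf{R}_1, \mathbf{R}_1 + \mathbf{R}) = \iint E\{dN(\mathbf{K}_1)\, dN^*(\mathbf{K}_2)\}\, e^{j(\mathbf{K}_1 \cdot \mathbf{R}_1 - \mathbf{K}_2 \cdot (\mathbf{R}_1 + \mathbf{R}))}. \tag{2.53} $$

According to the assumptions of stationarity and ergodicity of the process, $B_n(\mathbf{R}_1 + \mathbf{R}, \mathbf{R}_1) =
B_n(\mathbf{R})$. The only form of the right hand side for which the correlation will be independent
of $\mathbf{R}_1$ is obtained by forcing the index spectrum to be delta correlated on $\mathbf{K}$:

$$ E\{dN(\mathbf{K}_1)\, dN^*(\mathbf{K}_2)\} = \delta(\mathbf{K}_1 - \mathbf{K}_2)\, \Phi_n(\mathbf{K}_2)\, d^3K_1\, d^3K_2. \tag{2.54} $$

Substituting (2.54) into (2.52), with the arguments $(\mathbf{R}_1 + \mathbf{R}, \mathbf{R}_1)$, and evaluating the $\mathbf{K}_2$ integral:

$$ B_n(\mathbf{R}) = \int d^3K_1\, e^{j\mathbf{K}_1 \cdot (\mathbf{R}_1 + \mathbf{R})} \int \delta(\mathbf{K}_1 - \mathbf{K}_2)\, \Phi_n(\mathbf{K}_2)\, e^{-j\mathbf{K}_2 \cdot \mathbf{R}_1}\, d^3K_2, \tag{2.55} $$

$$ B_n(\mathbf{R}) = \int d^3K_1\, e^{j\mathbf{K}_1 \cdot \mathbf{R}}\, \Phi_n(\mathbf{K}_1). \tag{2.56} $$

Thus, the following three-dimensional Fourier pair:

$$ B_n(\mathbf{R}) = \int e^{j\mathbf{K} \cdot \mathbf{R}}\, \Phi_n(\mathbf{K})\, d^3K, \tag{2.57} $$

$$ \Phi_n(\mathbf{K}) = \frac{1}{(2\pi)^3} \int e^{-j\mathbf{K} \cdot \mathbf{R}}\, B_n(\mathbf{R})\, d^3R. \tag{2.58} $$

Recalling that the process is also isotropic, it is possible to further simplify the expression
by converting to spherical coordinates:

$$ B_n(R) = \frac{4\pi}{R} \int_0^{\infty} d\kappa\, \sin(\kappa R)\, \Phi_n(\kappa)\, \kappa, \tag{2.59} $$

$$ \Phi_n(\kappa) = \frac{1}{2\pi^2 \kappa} \int_0^{\infty} dR\, R \sin(\kappa R)\, B_n(R), \tag{2.60} $$

$$ \text{where } \kappa = |\mathbf{K}| \tag{2.61} $$

$$ \text{and } d^3K = \kappa^2 \sin\theta\, d\theta\, d\phi\, d\kappa. \tag{2.62} $$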


Combining  the  isotropic  form  for  Bn(R)  in  (2.59)  and  (2.45),  the  structure  function  can  be 
expressed  in  terms  of  the  spectral  density: 


$$ D_n(R) = 8\pi \int_0^{\infty} d\kappa\, \kappa^2\, \Phi_n(\kappa) \left[ 1 - \frac{\sin(\kappa R)}{\kappa R} \right]. \tag{2.63} $$


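Taking the inverse Fourier transform of (2.63), Strohbehn provides the very important result:

$$ \Phi_n(\kappa) = \frac{1}{4\pi^2 \kappa^2} \int_0^{\infty} \frac{\sin(\kappa R)}{\kappa R} \frac{d}{dR} \left[ R^2 \frac{d}{dR} D_n(R) \right] dR, \tag{2.64} $$

$$ \Phi_n(\kappa) = \frac{5}{18\pi^2} C_n^2\, \kappa^{-3} \int_{l_0}^{L_0} \sin(\kappa R)\, R^{-1/3}\, dR, \tag{2.65} $$

$$ \Phi_n(\kappa) \approx 0.033\, C_n^2\, \kappa^{-11/3}, \tag{2.66} $$

$$ \text{when } \frac{1}{L_0} \ll \kappa \ll \frac{1}{l_0}. \tag{2.67} $$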
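
The constant 0.033 in (2.66) can be reproduced from (2.65). The sketch below assumes the integration limits in (2.65) are extended to zero and infinity, which is reasonable inside the inertial range, and uses the standard integral of x^(-1/3) sin(x) over the positive axis, Gamma(2/3) sin(pi/3). The function names are invented for the example.

```python
import math

def kolmogorov_constant():
    """(5 / (18 pi^2)) * Gamma(2/3) * sin(pi/3), the coefficient in (2.66)
    obtained from (2.65) with the limits taken to 0 and infinity."""
    sine_integral = math.gamma(2.0 / 3.0) * math.sin(math.pi / 3.0)
    return 5.0 / (18.0 * math.pi ** 2) * sine_integral

def kolmogorov_spectrum(kappa, cn2):
    """Index of refraction power spectrum (2.66), valid in the inertial range."""
    return kolmogorov_constant() * cn2 * kappa ** (-11.0 / 3.0)

print("coefficient:", kolmogorov_constant())
print("spectrum at kappa = 10 rad/m, Cn2 = 1e-15:", kolmogorov_spectrum(10.0, 1e-15))
```

The change of variable x = kappa R in (2.65) contributes a factor kappa^(-2/3), which combines with the leading kappa^(-3) to give the -11/3 power law.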


These results provide tractable models for index of refraction in both the spatial, (2.40),
and spatial frequency, (2.66), domains. At this point, it would also be beneficial to transform
the results for index of refraction into a phase structure function and phase spectrum
respectively. The transformation from index of refraction to phase models follows in the
next section. Table 2.2 highlights the assumptions that were required to arrive at (2.40)
and (2.66).

     Assumption              Explanation
---  ----------------------  ------------------------------------------------------
1.   homogeneity             Atmospheric difference statistics depend only on the
                             displacement vector and not its location in space.
2.   isotropy                Difference statistics are dependent only on the
                             magnitude of the displacement vector.
3.   n solely dependent      Neglect effects of humidity and pressure on n.
     on temperature
4.   temperature             Large distance temperature effects are lumped
     is conservative         into an atmospheric profile constant $C_n^2$.
5.   ergodicity              Turbulence evolution time scale is long when
                             compared to wind flow across the aperture.
                             Spatial and temporal statistics are the same.
6.   band limited            Distant particle velocities become uncorrelated.
     turbulence

Table 2.2 Assumptions required to derive the index of refraction spectrum from the
Kolmogorov velocity structure function.

The phase spectrum will serve two purposes. The first useful property of the phase
spectrum is that it is necessary for deriving the expected power in each Zernike mode. A
detailed discussion of Zernike mode variance is included in Section 2.3. The power in
each Zernike mode is also a statistical variance which is essential for approximating $p_a(A)$
and the effective operation of the parameter based wavefront sensor. The second property
is that the phase spectrum provides a fast, efficient method for simulating atmospheric
turbulence. Assumption number 6 in Table 2.2 suggests making one more modification to
the Kolmogorov spectrum before using it to create a phase spectrum. The Kolmogorov
spectrum should include parameters for the inner and outer scale. Doing so effectively band
limits the spectrum by removing the singularity in (2.66) at $\kappa = 0$ and adding exponential
roll-off at high frequencies. Making these changes will force the spectrum to be absolutely
integrable while still matching the Kolmogorov model within the inertial range. The
following refractive index power spectrum is referred to as the von Karman spectrum:

$$ \Phi_n(\kappa) = 0.033\, C_n^2\, \frac{\exp\left( -\kappa^2 / \kappa_m^2 \right)}{\left( \kappa^2 + \kappa_0^2 \right)^{11/6}}, \tag{2.68} $$

$$ \text{where } 0 \le \kappa < \infty, \tag{2.69} $$

$$ \kappa_m = 5.92 / l_0, \tag{2.70} $$

$$ \text{and } \kappa_0 = 1 / L_0. \tag{2.71} $$
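
A short sketch of the von Karman spectrum (2.68) shows the two effects described above: agreement with the Kolmogorov power law inside the inertial range and a finite value at zero spatial frequency. The function name and the numerical values for the inner scale, outer scale, and structure constant are arbitrary choices made for illustration.

```python
import math

def von_karman_index_spectrum(kappa, cn2, inner_scale, outer_scale):
    """Refractive index power spectrum (2.68) with kappa_m and kappa_0
    taken from (2.70) and (2.71)."""
    kappa_m = 5.92 / inner_scale
    kappa_0 = 1.0 / outer_scale
    roll_off = math.exp(-(kappa / kappa_m) ** 2)
    return 0.033 * cn2 * roll_off / (kappa ** 2 + kappa_0 ** 2) ** (11.0 / 6.0)

cn2, inner_scale, outer_scale = 1e-15, 0.005, 100.0  # illustrative values only

# Inside the inertial range the ratio to the Kolmogorov form (2.66) is near one.
kappa = 10.0
kolmogorov = 0.033 * cn2 * kappa ** (-11.0 / 3.0)
print("ratio in inertial range:",
      von_karman_index_spectrum(kappa, cn2, inner_scale, outer_scale) / kolmogorov)

# At kappa = 0 the Kolmogorov form is singular; the von Karman form is finite.
print("value at kappa = 0:", von_karman_index_spectrum(0.0, cn2, inner_scale, outer_scale))
```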


Transforming the index of refraction spectrum into a phase spectrum begins by applying
thin screen theory. In the context of thin screen theory, turbulence effects are
condensed into a thin screen such that only the phase (not the amplitude) of the propagating
field is modulated by the screen. For instance, the field at the entrance and exit of a
thin screen may be represented as $V_1$ and $V_2$ respectively:

$$ V_1(x, y) = \exp\left( j P_\psi(x, y) \right), \tag{2.72} $$

$$ V_2(x, y) = \exp\left( j \left[ P_\psi(x, y) + P_\phi(x, y) \right] \right). \tag{2.73} $$

Note that the phase screen has unit amplitude and a phase function denoted $P_\phi(x, y)$.
Within this theory, a column of atmosphere could be simulated by multiple thin phase
screens each separated by free space as shown in Figure 2.3.

Figure 2.3 Atmospheric turbulence divided into discrete layers and modeled as a series
of thin phase screens.

The field propagating between these thin screens experiences fluctuation in both phase and
amplitude. Within the model described by Figure 2.3, the thin screen phase spectrum may be
constructed from a piecewise integral of the index of refraction spectrum multiplied by the
wave number:


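$$ \Phi_{P_\phi}(\kappa_x, \kappa_y) = 2\pi k^2 \int_{z_0}^{z_0 + \Delta z} \Phi_n(\kappa_x, \kappa_y, \kappa_z = 0, \xi)\, d\xi. \tag{2.74} $$

Substituting the von Karman spectrum for $\Phi_n$ provides the thin phase screen phase
spectrum:

$$ \Phi_{P_\phi}(\kappa_r) = 2\pi k^2 (0.033) \left( \kappa_r^2 + \kappa_0^2 \right)^{-11/6} \exp\left( -\frac{\kappa_r^2}{\kappa_m^2} \right) \int_{z_0}^{z_0 + \Delta z} C_n^2(\xi)\, d\xi, \tag{2.75} $$

$$ \text{where } \kappa_r = \sqrt{\kappa_x^2 + \kappa_y^2}. \tag{2.76} $$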


It is convenient to simplify the expression for $\Phi_{P_\phi}$ by introducing a constant called the
coherence diameter, $r_0$. $r_0$ is also commonly referred to as the Fried parameter [15]. The
Fried parameter accounts for the integrated $C_n^2$ and the wavelength of interest:

$$ r_0 = 0.185 \left[ \frac{\lambda^2}{\int_{z_0}^{z_0 + \Delta z} C_n^2(\xi)\, d\xi} \right]^{3/5}. \tag{2.77} $$


Where  the  coefficient  0.185  is  an  approximation: 

(  5r(i)  V 


24  /6 

4jrlr(l)  I  V5  V  5 


0.185. 


The Fried parameter offers insight as well as a compact notation. As a general rule, $r_0$
describes the strength of optical turbulence: as $r_0$ increases, the strength of the turbulence
decreases. It represents the spatial dimension for an optical system aperture beyond which
the resolving power advantages typically associated with increasing aperture diameter give
way to turbulence effects. In other words, increasing the size of the aperture beyond
$r_0$, while increasing the amount of light entering the system, does not improve resolution.
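
As a quick illustration of (2.77), the sketch below evaluates the Fried parameter for a given wavelength and integrated structure constant. The function name `fried_parameter` and the numerical inputs are chosen only for illustration; the integral of $C_n^2$ over the path is assumed to be supplied as a single number with units of m^(1/3).

```python
def fried_parameter(wavelength, cn2_integral):
    """Coherence diameter r0 from (2.77).

    wavelength   -- optical wavelength in meters
    cn2_integral -- path integral of Cn^2 in m^(1/3) (assumed precomputed)
    """
    return 0.185 * (wavelength ** 2 / cn2_integral) ** (3.0 / 5.0)

# Visible wavelength with an illustrative integrated turbulence strength.
print("r0 at 0.5 um:", fried_parameter(0.5e-6, 5e-13), "m")

# Weaker turbulence (smaller integral) and longer wavelengths both increase r0.
print("r0 with half the turbulence:", fried_parameter(0.5e-6, 2.5e-13), "m")
print("r0 at 1.0 um:", fried_parameter(1.0e-6, 5e-13), "m")
```

The wavelength dependence follows a 6/5 power, which is why the same atmosphere appears more benign at longer wavelengths.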


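Substituting $r_0$ into (2.75) provides the following expression for the thin screen phase
spectrum:

$$ \Phi_{P_\phi}(\kappa_r) = 0.4898\, r_0^{-5/3} \left( \kappa_r^2 + \kappa_0^2 \right)^{-11/6} \exp\left( -\frac{\kappa_r^2}{\kappa_m^2} \right). \tag{2.78} $$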


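Substituting the thin screen phase spectrum into the Fourier-Stieltjes integral yields the
corresponding thin screen phase structure function. Begin with a convenient form of the
transform integral in (2.63) provided by Tatarskii. This form of the integral assumes plane wave
propagation and local isotropy. Also, it is convenient to remove the subscript $r$ on $\kappa$ for
notational simplicity. Remember that $\kappa$, in this case, is only varying in an infinitesimally
thin slice of the atmosphere perpendicular to the path of propagation:

$$ D_{P_\phi}(R_x, R_y, 0) = 2 \iint_{-\infty}^{\infty} \left[ 1 - \cos(\kappa_x R_x + \kappa_y R_y) \right] \Phi_{P_\phi}(\kappa_x, \kappa_y)\, d\kappa_x\, d\kappa_y, \tag{2.79} $$

$$ D_{P_\phi}(R) = 4\pi \int_0^{\infty} d\kappa\, \kappa \left[ 1 - J_0(\kappa R) \right] 0.4898\, r_0^{-5/3} \left( \kappa^2 + \kappa_0^2 \right)^{-11/6} \exp\left( -\frac{\kappa^2}{\kappa_m^2} \right), \tag{2.80} $$

$$ D_{P_\phi}(R) = 0.4898\, r_0^{-5/3}\, 4\pi \left[ \int_0^{\infty} d\kappa\, \kappa\, \frac{\exp\left( -\kappa^2 / \kappa_m^2 \right)}{\left( \kappa^2 + \kappa_0^2 \right)^{11/6}} - \int_0^{\infty} d\kappa\, \kappa\, \frac{J_0(\kappa R)}{\left( \kappa^2 + \kappa_0^2 \right)^{11/6}} \exp\left( -\frac{\kappa^2}{\kappa_m^2} \right) \right]. \tag{2.81} $$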

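Note that the symbol $J_0$ represents the zero order Bessel function of the first kind. Examining
(2.81), the first integral can be computed in closed form using a table from Andrews:

$$ \int_0^{\infty} d\kappa\, \kappa^{2\mu}\, \frac{\exp\left( -\kappa^2 / \kappa_m^2 \right)}{\left( \kappa^2 + \kappa_0^2 \right)^{11/6}} = \frac{1}{2} \kappa_0^{2\mu - 8/3}\, \Gamma\left( \mu + \tfrac{1}{2} \right) U\left( \mu + \tfrac{1}{2};\ \mu - \tfrac{1}{3};\ \frac{\kappa_0^2}{\kappa_m^2} \right) \tag{2.82} $$

$$ \approx \frac{1}{2} \kappa_0^{2\mu - 8/3}\, \Gamma\left( \mu + \tfrac{1}{2} \right) \times \left[ \frac{\Gamma\left( \tfrac{4}{3} - \mu \right)}{\Gamma\left( \tfrac{11}{6} \right)} + \frac{\Gamma\left( \mu - \tfrac{4}{3} \right)}{\Gamma\left( \mu + \tfrac{1}{2} \right)} \left( \frac{\kappa_0^2}{\kappa_m^2} \right)^{4/3 - \mu} \right], \tag{2.83} $$

where the function symbol, $U(a; c; z)$, represents a confluent hypergeometric function of the
second kind. Substituting $\mu = \tfrac{1}{2}$ yields the following result:

$$ \int_0^{\infty} d\kappa\, \kappa\, \frac{\exp\left( -\kappa^2 / \kappa_m^2 \right)}{\left( \kappa^2 + \kappa_0^2 \right)^{11/6}} \approx \frac{1}{2} \kappa_0^{-5/3} \left[ \frac{\Gamma\left( \tfrac{5}{6} \right)}{\Gamma\left( \tfrac{11}{6} \right)} + \Gamma\left( -\tfrac{5}{6} \right) \left( \frac{\kappa_0^2}{\kappa_m^2} \right)^{5/6} \right]. \tag{2.84} $$

Approximating the effects of the third hypergeometric term, $\kappa_0^2 / \kappa_m^2 = l_0^2 / (35.05\, L_0^2)$, to be
nearly zero yields the following result:

$$ \int_0^{\infty} d\kappa\, \kappa\, \frac{\exp\left( -\kappa^2 / \kappa_m^2 \right)}{\left( \kappa^2 + \kappa_0^2 \right)^{11/6}} \approx \frac{1}{2} \kappa_0^{-5/3}\, U\left( 1;\ \tfrac{1}{6};\ {\approx}0 \right) \approx \frac{1}{2} \kappa_0^{-5/3}\, \frac{\Gamma\left( \tfrac{5}{6} \right)}{\Gamma\left( \tfrac{11}{6} \right)} = \frac{3}{5} \kappa_0^{-5/3}. \tag{2.85} $$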


The second integral in (2.81) presents a slight problem since it has no analytical solution. In order to simplify the integrand to a form that will provide an analytical result, the inner scale term must be removed. There are a few items to consider when determining whether such an assumption is viable. The primary purpose of this derivation is computer phase screen simulation. For the purposes of computer modeling, the inner scale effects will be negligible as long as the smallest spatial sampling in the phase screen is on the order of $l_0$ or larger. In this case, the roll-off in the structure function due to $\kappa_m$ may go unnoticed. A plot comparison of the analytical structure function derived below and a numerical evaluation of the structure function in (2.81) is presented in Figure 2.4. Removing the exponential term, the remaining integral can be solved using a table from Gradshteyn:


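\[
\int_0^{\infty} d\kappa\, \frac{J_\nu(b\kappa)\, \kappa^{\nu + 1}}{\left(\kappa^2 + a^2\right)^{\mu + 1}} = \frac{a^{\nu - \mu}\, b^{\mu}}{2^{\mu}\, \Gamma(\mu + 1)}\, K_{\nu - \mu}(ab). \tag{2.86}
\]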


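Substituting $\mu = \tfrac{5}{6}$, $\nu = 0$, $b = R$, and $a = \kappa_0$ yields the following result:

\[
\int_0^{\infty} d\kappa\, \frac{\kappa}{\left(\kappa^2 + \kappa_0^2\right)^{11/6}}\, J_0(\kappa R) = \frac{\kappa_0^{-5/6} R^{5/6}}{2^{5/6}\, \Gamma\left(\tfrac{11}{6}\right)}\, K_{-5/6}(\kappa_0 R) = \frac{1}{\Gamma\left(\tfrac{11}{6}\right)} \left(\frac{R}{2\kappa_0}\right)^{5/6} K_{5/6}(\kappa_0 R). \tag{2.87}
\]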


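Note that $K_\nu$ represents a modified Bessel function of the second kind. Combining the results in (2.85) and (2.87) produces the familiar closed form expression for $D_{P_\phi}(R)$:

\[
D_{P_\phi}(R) = 0.4898\, r_0^{-5/3}\, 4\pi \left[ \frac{3}{5}\, \kappa_0^{-5/3} - \frac{1}{\Gamma\left(\tfrac{11}{6}\right)} \left(\frac{R}{2\kappa_0}\right)^{5/6} K_{5/6}(\kappa_0 R) \right]. \tag{2.88}
\]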


These results provide both spatial (2.88) and spatial frequency (2.78) statistical models for the atmospheric turbulence phase. These two very important expressions were carefully derived from first principles. The phase spectrum and phase structure function will prove vital when creating a computer-based atmospheric simulation with which to test the wavefront sensor.
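As a quick numerical companion to (2.78), the following Python sketch evaluates the thin screen phase spectrum. It is an illustration only and not the simulation code used in this research: the function and argument names are hypothetical, and the conventions $\kappa_0 = 2\pi/L_0$ and $\kappa_m = 5.92/l_0$ are assumed here.

```python
import math


def von_karman_phase_spectrum(kappa, r0, L0=math.inf, l0=0.0):
    """Evaluate the thin screen phase spectrum of (2.78).

    kappa : spatial frequency magnitude [rad/m], must be > 0 when L0 is infinite
    r0    : Fried parameter [m]
    L0    : outer scale [m]; assumed convention kappa_0 = 2*pi/L0
    l0    : inner scale [m]; assumed convention kappa_m = 5.92/l0
    """
    # An infinite outer scale removes the kappa_0 term entirely.
    kappa0 = 0.0 if math.isinf(L0) else 2.0 * math.pi / L0
    spectrum = (0.4898 * r0 ** (-5.0 / 3.0)
                * (kappa ** 2 + kappa0 ** 2) ** (-11.0 / 6.0))
    # A zero inner scale removes the exponential roll-off.
    if l0 > 0.0:
        kappa_m = 5.92 / l0
        spectrum *= math.exp(-(kappa ** 2) / kappa_m ** 2)
    return spectrum
```

With the default arguments the function reduces to the Kolmogorov form, $0.4898\, r_0^{-5/3} \kappa^{-11/3}$. Supplying finite values for both scales can only lower the spectrum at a given frequency.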


2.3  Defining  the  Parameter  Space 

Figure 2.4 Top: the numeric integral structure function [solid line] in (2.81) and the analytical form [dashed line] in (2.88), [$r_0 = 0.088$ m, $l_0 = 0.01$ m, $L_0 = 10$ m]. Bottom: percent difference between the numeric integral and analytic structure functions.

Given a statistical model for the wavefront phase, the next step is to relate the model to the Zernike polynomial expansion introduced earlier. In doing so, it is possible to combine the ideas from the parameter estimation section with the modal statistics from the atmospheric model. Recall that deriving a probabilistic mapping from the image intensity to the set of estimated lower order Zernike modes is the essence of the parameter estimator. Chapter 1 introduced the fact that the phase aberration function can be approximated by a linear combination of Zernike polynomials:

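\[
P_\phi(\mathbf{r}) \approx W_P(\mathbf{r}; R_P) \sum_{i=2}^{N} a_i Z_i(\mathbf{r}; R_P), \tag{2.89}
\]

where

\begin{align*}
R_P &= \text{radius of the aperture}, \\
\mathbf{r} &= (r, \theta), \quad 0 \le r < \infty, \quad 0 \le \theta < 2\pi, \\
Z_i(\mathbf{r}; R_P) &= Z_i\left(\frac{r}{R_P}, \theta\right).
\end{align*}

The function $W_P$ is a pupil windowing function:

\[
W_P(\mathbf{r}; R_P) = \begin{cases} 1, & r \le R_P \\ 0, & r > R_P \end{cases} \tag{2.90}
\]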

Note that the two-dimensional coordinates $(r, \theta)$ have been condensed into a single vector representation, $\mathbf{r}$, whenever convenient. The pupil or aperture radius, $R_P$, may be included in Zernike functions and windowing functions to give the notation more generality. Given some measurement of the field phase, the only unknowns are the $a_i$'s. Assuming that it is possible to develop a method for transforming intensity, which can be measured directly, into field phase, the estimator must then guess at the values for the $a_i$'s. Within the atmospheric statistics there must be an average value and a variance for each of the estimated coefficients. Given an average value and a range of, say, $\pm x$ standard deviations, the estimator could define a most likely starting range for each coefficient, much like the assumption required for the ML estimator. Further, the MAP estimator could be fashioned from an average value and an accurate variance for each parameter. The set of expected values and variances of each parameter will be referred to as the parameter space. This section will discuss a few basics concerning the Zernike basis set and the theory which links the Zernike basis to a particular atmospheric phase spectrum.

The Zernike background begins with a demonstration of how to construct each of the Zernike modes, how to calculate Zernike coefficients from a given phasefront measurement, and how to calculate the mean and variance of each Zernike mode. This treatment follows Roggemann [3] with the exception of some minor notation changes. The coefficients, $a_i$, can be found by projecting each Zernike onto the wavefront phase:

\begin{align}
a_i &= \int d\boldsymbol{\rho}\, W_Z(\boldsymbol{\rho})\, Z_i(\boldsymbol{\rho}; 1)\, P_\phi(\rho R_P, \theta), \tag{2.91} \\
\boldsymbol{\rho} &= \left(\frac{r}{R_P}, \theta\right), \tag{2.92} \\
W_Z(\boldsymbol{\rho}) &= \begin{cases} \frac{1}{\pi}, & |\boldsymbol{\rho}| \le 1 \\ 0, & |\boldsymbol{\rho}| > 1. \end{cases} \tag{2.93}
\end{align}

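Note that the use of the scaled coordinates, $\boldsymbol{\rho}$, is required because the Zernike polynomials are only valid on the unit circle. The weighted windowing function, $W_Z(\boldsymbol{\rho}; 1)$, provides the limits of integration and the appropriate scaling such that the Zernikes are orthonormal on the unit circle. Recall that Zernike polynomials are defined as an orthonormal basis strictly over the unit circle:

\[
\int d\boldsymbol{\rho}\, W_Z(\boldsymbol{\rho})\, Z_i(\boldsymbol{\rho}; 1) = 0, \tag{2.94}
\]

for $i \ge 2$, and

\begin{align}
\int d\boldsymbol{\rho}\, W_Z(\boldsymbol{\rho})\, Z_i(\boldsymbol{\rho}; 1)\, Z_{i'}(\boldsymbol{\rho}; 1) &= \delta_{ii'}, \tag{2.95} \\
\delta_{ii'} &= \begin{cases} 0, & i \ne i' \\ 1, & i = i', \end{cases} \tag{2.96}
\end{align}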

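and hence, the need for the scaled windowing function and radial coordinate scaling. Recall that the wavefront sensor will estimate modes beginning with $Z_2$. For this reason, the phase function of interest is the piston removed, or zero mean, phase aberration, $P_\phi$. Consequently, each coefficient, $i \ge 2$, has zero mean:

\begin{align}
E\{a_i\} &= E\left\{ \int d\boldsymbol{\rho}\, W_Z(\boldsymbol{\rho})\, Z_i(\boldsymbol{\rho})\, P_\phi(\rho R_P, \theta) \right\}, \tag{2.97} \\
&= \int d\boldsymbol{\rho}\, W_Z(\boldsymbol{\rho})\, Z_i(\boldsymbol{\rho})\, E\{P_\phi(\rho R_P, \theta)\}, \tag{2.98} \\
&= 0. \tag{2.99}
\end{align}

The radius argument in $W_Z$ and $Z_i$ is removed to compact the notation for the case $R_P = 1$. The coefficient variance is another important statistical moment:

\begin{align}
E\{a_i a_{i'}\} &= E\left\{ \int d\boldsymbol{\rho}\, W_Z(\boldsymbol{\rho})\, Z_i(\boldsymbol{\rho})\, P_\phi(\rho R_P, \theta) \times \int d\boldsymbol{\rho}'\, W_Z(\boldsymbol{\rho}')\, Z_{i'}(\boldsymbol{\rho}')\, P_\phi(\rho' R_P, \theta') \right\}, \tag{2.100} \\
&= \int d\boldsymbol{\rho}\, W_Z(\boldsymbol{\rho}) \int d\boldsymbol{\rho}'\, W_Z(\boldsymbol{\rho}')\, Z_i(\boldsymbol{\rho})\, Z_{i'}(\boldsymbol{\rho}')\, E\{P_\phi(\rho R_P, \theta)\, P_\phi(\rho' R_P, \theta')\}, \tag{2.101} \\
&= \begin{cases} \mathrm{Var}\{a_i\}, & i = i' \\ \mathrm{Cov}\{a_i a_{i'}\}, & i \ne i'. \end{cases} \tag{2.102}
\end{align}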


The amount of power or phase error associated with any coefficient is related to the coefficient variance. Given a plane wave reference, the aperture mean square error will be shown to be (see Section 6.1):

\[
\langle \epsilon^2 \rangle = \int d\boldsymbol{\rho}\, W_Z(\boldsymbol{\rho})\, E\{P_\phi^2(\rho R_P, \theta)\} = \sum_{i=2}^{\infty} E\{a_i^2\}. \tag{2.103}
\]


The mean square phase aberration is commonly used as a measure of the atmospheric distortion present in an optics system. The mean square measure of distortion equates to the sum of the variances in each Zernike coefficient. This relationship makes the modal expansion extremely useful in identifying the amount of phase error expected of the atmospheric model and how that error is distributed among the modes.
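The bookkeeping implied by (2.103) can be sketched in a few lines of Python. The example below is illustrative only: the function name is hypothetical, the scaled variances are the diagonal entries of Table 2.5 (Kolmogorov turbulence), and the sum is truncated at mode 11, so it gives only the portion of the mean square phase carried by modes 2 through 11 rather than the full infinite sum.

```python
# Scaled variances E{a_i^2} (D_P/r0)^(-5/3) for Noll modes 2..11,
# copied from the diagonal of Table 2.5.
SCALED_VARIANCE = {
    2: 0.449, 3: 0.449,
    4: 0.0232, 5: 0.0232, 6: 0.0232,
    7: 0.00619, 8: 0.00619, 9: 0.00619, 10: 0.00619,
    11: 0.00245,
}


def partial_mean_square_phase(d_over_r0, modes=range(2, 12)):
    """Sum the variances of the selected modes for a given D_P/r0 [rad^2]."""
    scale = d_over_r0 ** (5.0 / 3.0)
    return scale * sum(SCALED_VARIANCE[i] for i in modes)
```

For example, `partial_mean_square_phase(1.0, modes=range(4, 12))` gives the same partial sum with the two tilt modes excluded, which shows how strongly tilt dominates the low order error budget.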


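Section 1.4 in the Introduction identified the first 6 Zernike polynomials. The set of rules below provides a means to derive any Zernike polynomial [7]:

\begin{align}
\left. \begin{aligned} Z_{i\,\mathrm{even}}(r, \theta) &= \sqrt{2(n+1)}\, R_n^m(r) \cos(m\theta) \\ Z_{i\,\mathrm{odd}}(r, \theta) &= \sqrt{2(n+1)}\, R_n^m(r) \sin(m\theta) \end{aligned} \right\} \quad & m \ne 0, \tag{2.104} \\
Z_i(r, \theta) = \sqrt{n+1}\, R_n^0(r), \quad & m = 0, \tag{2.105} \\
R_n^m(r) = \sum_{s=0}^{(n-m)/2} \frac{(-1)^s (n-s)!}{s! \left(\frac{n+m}{2} - s\right)! \left(\frac{n-m}{2} - s\right)!}\, r^{n-2s}. & \tag{2.106}
\end{align}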
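The construction rules in (2.104) through (2.106) translate directly into code. The Python sketch below is a minimal illustration and not the implementation used in this research: the function names are hypothetical, and the `even` flag is an assumed way of selecting the cosine or sine form. The last function estimates the weighted inner product of (2.95) with a simple midpoint rule so that orthonormality can be checked numerically.

```python
import math


def radial_poly(n, m, r):
    """Radial polynomial R_n^m(r) of (2.106); n - m must be even."""
    total = 0.0
    for s in range((n - m) // 2 + 1):
        num = (-1) ** s * math.factorial(n - s)
        den = (math.factorial(s)
               * math.factorial((n + m) // 2 - s)
               * math.factorial((n - m) // 2 - s))
        total += num / den * r ** (n - 2 * s)
    return total


def zernike(n, m, even, r, theta):
    """Zernike polynomial per (2.104) and (2.105) on the unit circle."""
    if m == 0:
        return math.sqrt(n + 1) * radial_poly(n, 0, r)
    angular = math.cos(m * theta) if even else math.sin(m * theta)
    return math.sqrt(2 * (n + 1)) * radial_poly(n, m, r) * angular


def inner_product(za, zb, n_r=200, n_t=200):
    """Midpoint-rule estimate of the weighted inner product in (2.95)."""
    dr, dt = 1.0 / n_r, 2.0 * math.pi / n_t
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for k in range(n_t):
            t = (k + 0.5) * dt
            total += za(r, t) * zb(r, t) * r  # r is the polar area element
    return total * dr * dt / math.pi  # 1/pi is the window weight W_Z
```

Passing the same mode twice to `inner_product` should return a value close to one, and passing two different modes should return a value close to zero, up to the quadrature error of the grid.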

There is more than one ordering scheme for Zernike polynomials. To minimize confusion, I will quickly outline two common ordering schemes. Ordering Zernike polynomials requires at least two indices due to the two degrees of freedom $r$ (radial) and $\theta$ (azimuthal). Most ordering schemes use a single index, $i$, to sort the polynomials and two other indices, $(n, m)$, to identify radial order and azimuthal order. Two common ordering methods are provided by Malacara and Noll [7]. Each ordering scheme has its respective benefits. For instance, Malacara's ordering offers a simple relationship between the primary index $i$ and the dual indices $n$ and $m$:

\begin{align}
n &= \text{next integer greater than } \frac{-3 + \sqrt{1 + 8i}}{2}, \tag{2.107} \\
m &= i - \frac{n(n+1) + 2}{2}. \tag{2.108}
\end{align}

This paper will adopt Noll's ordering scheme. Noll's ordering places an odd or even dependency between the overall index $i$ and the symmetry of each polynomial. This is extremely beneficial to the derivation to follow in that it allows for a very compact solution to the coefficient variance. Unfortunately, Noll's scheme does not share the benefit of having a simple relationship between the primary and dual indices. Instead, an algorithm similar to the one shown in Table 2.3 is required to generate $n$ and $m$ from $i$:

    function [n, m] = index2nm(i)
    % Find the row x of the Zernike triangle that contains index i
    x = 1;
    xsum = 1;
    while xsum < i
        x = x + 1;
        xsum = xsum + x;
    end

    n = x - 1;              % radial order
    m = (i - xsum + x);     % position of i within row n

    % Force m to share the parity of n
    if n/2 == round(n/2)
        if m/2 ~= round(m/2)
            m = m - 1;
        end
    else
        if m/2 == round(m/2)
            m = m - 1;
        end
    end

    return

Table 2.3 Example algorithm designed to generate n and m from i using Noll's Zernike ordering scheme.
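For readers working outside of Matlab, the algorithm of Table 2.3 can be ported line for line. The Python version below is a sketch under that assumption: the function name `noll_to_nm` is hypothetical, and the parity test is written with the modulo operator instead of the rounding comparison used in the table.

```python
def noll_to_nm(i):
    """Return (n, m) for Noll index i >= 1, following Table 2.3."""
    if i < 1:
        raise ValueError("Noll index must be a positive integer")
    # Find the row x of the Zernike triangle that contains index i.
    x, xsum = 1, 1
    while xsum < i:
        x += 1
        xsum += x
    n = x - 1            # radial order
    m = i - xsum + x     # position of i within row n, from 1 to n + 1
    # The azimuthal order must share the parity of the radial order.
    if (n - m) % 2 != 0:
        m -= 1
    return n, m
```

Calling the function for $i = 1, \ldots, 11$ reproduces the $(n, m)$ pairs of piston, the two tilts, defocus, the two astigmatisms, the two comas, the two trefoils, and spherical aberration.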
Piston: (1, 0, 0)
Tilt-x: (2, 1, 1)
Tilt-y: (3, 1, 1)
Defocus: (4, 2, 0)
Astigmatism-xy: (5, 2, 2)
Astigmatism: (6, 2, 2)
Trefoil: (9, 3, 3)
Trefoil: (10, 3, 3)
Spherical: (11, 4, 0)

Table 2.4 The first 11 Zernike polynomials and their corresponding i, n, and m Noll ordering.


Given the expressions for the Zernike polynomials above and Noll's ordering scheme, it is possible to unambiguously describe each polynomial and single out its effects on the wavefront. Table 2.4 contains example phase plots and demonstrates the indexing for the first eleven Zernikes. Each Zernike image in Table 2.4 is labeled according to the convention, Zernike name: $(i, n, m)$.

This section began with the purpose of deriving a relationship between the statistical turbulence model and the modal decomposition coefficients. Demonstrating how the atmospheric statistics derived in the previous section relate to the Zernike coefficients requires a frequency domain representation of the Zernike modes. Let $Q_i(\kappa, \psi)$ be the frequency domain representation of $Z_i(r, \theta)$. Born provides the following transform pair [19]:

\begin{align}
W_P(\mathbf{r}; 1)\, Z_i(\mathbf{r}) &= \int d^2\boldsymbol{\kappa}\, Q_i(\kappa, \psi) \exp\{-j \boldsymbol{\kappa} \cdot \mathbf{r}\}, \tag{2.109} \\
\text{where } \boldsymbol{\kappa} &= (\kappa_x, \kappa_y), \tag{2.110} \\
\kappa &= \sqrt{\kappa_x^2 + \kappa_y^2}, \tag{2.111} \\
\text{and } \psi &= \mathrm{atan2}(\kappa_y, \kappa_x), \tag{2.112}
\end{align}

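where atan2 is the Matlab four-quadrant arctangent function, which resolves the angle over a full $2\pi$ range. The frequency domain Zernike functions, $Q_i$, are given by:

\begin{align}
\left. \begin{aligned} Q_{\mathrm{even}\, i}(\kappa, \psi) \\ Q_{\mathrm{odd}\, i}(\kappa, \psi) \end{aligned} \right\} &= \sqrt{n+1}\, \frac{2 J_{n+1}(\kappa)}{\kappa} \begin{cases} (-1)^{(n-m)/2}\, j^m \sqrt{2} \cos m\psi \\ (-1)^{(n-m)/2}\, j^m \sqrt{2} \sin m\psi \end{cases} \quad \text{for } m \ne 0, \notag \\
Q_i(\kappa, \psi) &= \sqrt{n+1}\, \frac{2 J_{n+1}(\kappa)}{\kappa}\, (-1)^{n/2}, \quad \text{for } m = 0. \tag{2.113}
\end{align}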

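Note that $J_\nu$ represents the Bessel function of the first kind with order $\nu$. With this property in hand, return to the expression for Zernike coefficient variance (2.101):

\[
E\{a_i^* a_{i'}\} = \int d\boldsymbol{\rho}\, W_Z(\boldsymbol{\rho}) \int d\boldsymbol{\rho}'\, W_Z(\boldsymbol{\rho}')\, Z_i(\boldsymbol{\rho})\, Z_{i'}(\boldsymbol{\rho}')\, E\{P_\phi(\rho R_P, \theta)\, P_\phi(\rho' R_P, \theta')\}. \tag{2.114}
\]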


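Making use of (2.113) and (2.78) it is possible to rewrite this expression in the frequency domain:

\[
E\{a_i^* a_{i'}\} = \int d\psi \int d\psi' \int\!\!\int \frac{\kappa\, d\kappa}{R_P^2}\, \frac{\kappa'\, d\kappa'}{R_P^2}\, Q_i^*(\kappa, \psi)\, \Phi_{P_\phi}\left(\frac{\kappa}{R_P}, \psi; \frac{\kappa'}{R_P}, \psi'\right) Q_{i'}(\kappa', \psi'). \tag{2.115}
\]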


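Enforcing the condition that $\Phi_{P_\phi}$ is delta correlated, the expression for $E\{a_i^* a_{i'}\}$ can be reduced to two integrals:

\begin{align}
E\{a_i^* a_{i'}\} &= \int d\psi \int d\psi' \int \frac{1}{R_P} \frac{\kappa}{R_P}\, d\kappa \int \frac{1}{R_P} \frac{\kappa'}{R_P}\, d\kappa' \times Q_i^*(\kappa, \psi)\, \Phi_{P_\phi}\left(\frac{\kappa}{R_P}, \psi\right) Q_{i'}(\kappa', \psi')\, \delta(\kappa - \kappa')\, \delta(\psi - \psi'), \tag{2.116} \\
E\{a_i^* a_{i'}\} &= \int d\psi \int \frac{1}{R_P} \frac{\kappa}{R_P}\, d\kappa\, Q_i^*(\kappa, \psi)\, \Phi_{P_\phi}\left(\frac{\kappa}{R_P}, \psi\right) Q_{i'}(\kappa, \psi). \tag{2.117}
\end{align}

Now recall that $\Phi_{P_\phi}$ is symmetrical in $\psi$. Removing the dependence of $\Phi_{P_\phi}$ on $\psi$ and reordering the integrals yields:

\[
E\{a_i^* a_{i'}\} = \int \frac{1}{R_P} \frac{\kappa}{R_P}\, d\kappa\, \Phi_{P_\phi}\left(\frac{\kappa}{R_P}\right) \int d\psi\, Q_i^*(\kappa, \psi)\, Q_{i'}(\kappa, \psi). \tag{2.118}
\]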


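Substituting $\kappa / R_P$ for $\kappa_r$ in (2.78), $\Phi_{P_\phi}$ is given by:

\[
\Phi_{P_\phi}\left(\frac{\kappa}{R_P}\right) = 0.4898\, r_0^{-5/3} \left[ \left(\frac{\kappa}{R_P}\right)^2 + \kappa_0^2 \right]^{-11/6} \exp\left\{ -\frac{\kappa^2}{R_P^2 \kappa_m^2} \right\}. \tag{2.119}
\]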


This expression needs a few points of discussion. The ratio, $1/R_P$, should be factored separately within the equation as it will allow introducing the ratio $R_P / r_0$. Recalling the brief introduction to $r_0$ should clarify the importance of the ratio. Also, for mathematical tractability, a separate expression for $\Phi_{P_\phi}$ neglecting the effects of inner and outer scale can be derived. The cost of this assumption is that the results derived from this form of the expression are only valid in the inertial range. High frequency modes with periods smaller than the inner scale will suffer variance overage errors due to improper inner scale roll off. Likewise, low frequency modes with periods longer than the outer scale will have high variance estimates. With these cautions in mind, I offer the following two forms of $\Phi_{P_\phi}$: the first with both inner and outer scale,


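\begin{align}
\Phi_{P_\phi}\left(\frac{\kappa}{R_P}\right) &= 0.4898 \left(\frac{R_P}{r_0}\right)^{5/3} R_P^{-5/3} \left( \frac{\kappa^2}{R_P^2} + \kappa_0^2 \right)^{-11/6} \exp\left\{ -\frac{\kappa^2}{R_P^2 \kappa_m^2} \right\} \tag{2.120} \\
&= 0.4898 \left(\frac{R_P}{r_0}\right)^{5/3} R_P^2 \left( \kappa^2 + R_P^2 \kappa_0^2 \right)^{-11/6} \exp\left\{ -\frac{\kappa^2}{R_P^2 \kappa_m^2} \right\}, \tag{2.121}
\end{align}

and the second (Kolmogorov turbulence) with inner and outer scale terms removed,

\[
\Phi_{P_\phi}\left(\frac{\kappa}{R_P}\right) = 0.4898 \left(\frac{R_P}{r_0}\right)^{5/3} R_P^2\, \kappa^{-11/3}. \tag{2.122}
\]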


The spectrum containing both inner scale and outer scale compensation will be reserved for numerical results to compare with the analytical solution. Proceeding to form an analytical result to the correlation integral, substitute the simplified phase spectrum into (2.118) and make use of the frequency domain Zernike expression in (2.113):

\begin{align}
E\{a_i^* a_{i'}\} &= \int \frac{1}{R_P^2}\, \kappa\, d\kappa\; 0.4898 \left(\frac{R_P}{r_0}\right)^{5/3} R_P^2\, \kappa^{-11/3} \int d\psi\, Q_i^*(\kappa, \psi)\, Q_{i'}(\kappa, \psi), \tag{2.123} \\
&= 0.4898 \left(\frac{R_P}{r_0}\right)^{5/3} \int \kappa^{-8/3}\, d\kappa \int Q_i^*(\kappa, \psi)\, Q_{i'}(\kappa, \psi)\, d\psi. \tag{2.124}
\end{align}


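Substituting (2.113) for $Q$ yields:

\begin{align}
E\{a_i^* a_{i'}\} = {} & 2\pi \cdot 0.4898 \left(\frac{R_P}{r_0}\right)^{5/3} \sqrt{(n+1)(n'+1)}\, (-1)^{(n+n'-2m)/2}\, \delta_{mm'} \times \int \kappa^{-8/3}\, \frac{2 J_{n+1}(\kappa)}{\kappa}\, \frac{2 J_{n'+1}(\kappa)}{\kappa}\, d\kappa, \tag{2.125} \\
= {} & 0.4898 \cdot 2^{4/3} \pi \left(\frac{2 R_P}{r_0}\right)^{5/3} \sqrt{(n+1)(n'+1)}\, (-1)^{(n+n'-2m)/2}\, \delta_{mm'} \times \int d\kappa\, \kappa^{-14/3} J_{n+1}(\kappa)\, J_{n'+1}(\kappa), \quad i - i' \text{ even}, \tag{2.126} \\
= {} & 0, \quad i - i' \text{ odd}. \tag{2.127}
\end{align}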


It is important to highlight a few simplifications required for the form in (2.126). The extra factor of 2 in $\left(2 R_P / r_0\right)^{5/3}$ was added for convenience. The expression now relates Zernike coefficient covariance to the ratio of the pupil diameter, $D_P$, to $r_0$. The ratio $D_P / r_0$ dominates the Zernike variance expression within the inertial range. For this reason, the ratio will be used as an indicator of turbulence strength throughout the remainder of this document. Also, several simplifications reducing the integral over $\psi$ are made possible by the symmetry in the Zernike modes combined with Noll's ordering scheme. First, if $i - i'$ is odd then the integral over $\psi$:

\[
\int_0^{2\pi} \cos(m\psi) \sin(m'\psi)\, d\psi, \tag{2.128}
\]

will integrate to zero, given $m, m' \in \mathbb{Z}$. By similar reasoning, only those covariances where $m = m'$ are nonzero, because:

\begin{align}
& \text{if } m, z \in \mathbb{Z},\ z \ne 0, \tag{2.129} \\
& \text{then } \int_0^{2\pi} \cos(m\psi) \cos((m+z)\psi)\, d\psi = 0. \tag{2.130}
\end{align}

Hence, the $\delta_{mm'}$ term. The factor of $2\pi$ leading the expression in (2.125) is due to nonzero cases for the integral over $\psi$. Each nonzero case for the integral over $\psi$ results in a factor of $2\pi$:

\begin{align}
\int_0^{2\pi} 2 \cos^2(m\psi)\, d\psi &= 2\pi, \tag{2.131} \\
\int_0^{2\pi} 2 \sin^2(m\psi)\, d\psi &= 2\pi, \tag{2.132} \\
\int_0^{2\pi} d\psi &= 2\pi. \tag{2.133}
\end{align}

The term, $(-1)^{(n+n'-2m)/2}$, results from the following cases:

\begin{align}
\left[ (-1)^{(n-m)/2} j^m \right]^* (-1)^{(n'-m)/2} j^m &= (-1)^{(n-m)/2} (-1)^{(n'-m)/2}, \quad \text{for } m \text{ even}, \tag{2.134} \\
\text{and } \left[ (-1)^{(n-m)/2} j^m \right]^* (-1)^{(n'-m)/2} j^m &= -j(j)\, (-1)^{(n-m)/2} (-1)^{(n'-m)/2}, \quad \text{for } m \text{ odd}. \tag{2.135}
\end{align}


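The remaining integral can be solved via table. The form is as follows:

\[
\int_0^{\infty} J_\nu(at)\, J_\mu(at)\, t^{-\lambda}\, dt = \frac{a^{\lambda - 1}\, \Gamma(\lambda)\, \Gamma\left(\frac{\nu + \mu - \lambda + 1}{2}\right)}{2^{\lambda}\, \Gamma\left(\frac{-\nu + \mu + \lambda + 1}{2}\right) \Gamma\left(\frac{\nu + \mu + \lambda + 1}{2}\right) \Gamma\left(\frac{\nu - \mu + \lambda + 1}{2}\right)}. \tag{2.136}
\]

Letting $\nu = n + 1$, $\mu = n' + 1$, $\lambda = \tfrac{14}{3}$, and $a = 1$, the final simplified solution to the covariance is:

\begin{align}
E\{a_i^* a_{i'}\} \left(\frac{D_P}{r_0}\right)^{-5/3} = {} & 0.4898 \cdot 2^{4/3} \pi\, \sqrt{(n+1)(n'+1)}\, (-1)^{(n+n'-2m)/2}\, \delta_{mm'} \times \notag \\
& \frac{\Gamma\left(\tfrac{14}{3}\right) \Gamma\left(\frac{n + n' - \frac{5}{3}}{2}\right)}{2^{14/3}\, \Gamma\left(\frac{n' - n + \frac{17}{3}}{2}\right) \Gamma\left(\frac{n + n' + \frac{23}{3}}{2}\right) \Gamma\left(\frac{n - n' + \frac{17}{3}}{2}\right)}. \tag{2.137}
\end{align}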


This result is convenient for generating entries in the covariance matrix for the Zernike coefficients. Table 2.5 provides the scaled covariance results, $E\{a_i^* a_{i'}\} \left(D_P / r_0\right)^{-5/3}$, for the first 11 Zernike coefficients neglecting piston.

i \ i'   2        3        4        5        6        7        8        9        10       11
2        0.449    0        0        0        0        0        -0.0141  0        0        0
3        0        0.449    0        0        0        -0.0141  0        0        0        0
4        0        0        0.0232   0        0        0        0        0        0        0
5        0        0        0        0.0232   0        0        0        0        0        0
6        0        0        0        0        0.0232   0        0        0        0        0
7        0        -0.0141  0        0        0        0.00619  0        0        0        0
8        -0.0141  0        0        0        0        0        0.00619  0        0        0
9        0        0        0        0        0        0        0        0.00619  0        0
10       0        0        0        0        0        0        0        0        0.00619  0
11       0        0        0        0        0        0        0        0        0        0.00245

Table 2.5 Normalized Zernike coefficient covariance: $L_0 = \infty$, $l_0 = 0$.

This table reflects the results from Noll for the case of infinite outer scale and inner scale equal to 0. There is no analytical result to the expression for $E\{a_i^* a_{i'}\}$ that includes the effects of von Karman inner and outer scale, but it is easy enough to approximate the integration numerically. Forming the covariance integral with the von Karman turbulence spectrum yields:


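\begin{align}
E\{a_i^* a_{i'}\} = {} & 0.4898 \cdot 2^{4/3} \pi \left(\frac{D_P}{r_0}\right)^{5/3} \sqrt{(n+1)(n'+1)}\, (-1)^{(n+n'-2m)/2}\, \delta_{mm'} \times \notag \\
& \int d\kappa\, \frac{J_{n+1}(\kappa)\, J_{n'+1}(\kappa)}{\kappa \left(\kappa^2 + R_P^2 \kappa_0^2\right)^{11/6}} \exp\left\{ -\frac{\kappa^2}{R_P^2 \kappa_m^2} \right\}, \quad i - i' \text{ even}, \tag{2.138} \\
= {} & 0, \quad i - i' \text{ odd}. \tag{2.139}
\end{align}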


Tables 2.6 and 2.7 demonstrate the effects on lower order Zernike variance of a finite outer scale and an inner scale greater than zero. Table 2.6 shows the effects of varying outer scale on the first 11 Zernike modes when inner scale is 0. Table 2.7 shows the effects of varying inner scale when outer scale is infinite. These results show that the analytical covariance result serves as a guideline only and the numerical covariance should be used as the ideal reference in situations where the outer scale and inner scale are known or where they may be estimated.
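Entries of the analytical covariance matrix can be generated directly from the closed form in (2.137). The Python sketch below is one possible way to do so and is offered as an illustration only: the function name is hypothetical, it assumes the caller has already applied the selection rules ($m = m'$ and $i - i'$ even), and it evaluates only the Kolmogorov case without inner or outer scale.

```python
import math


def scaled_covariance(n, n_prime, m):
    """Evaluate E{a_i* a_i'} (D_P/r0)^(-5/3) from (2.137).

    n, n_prime : radial orders of the two modes
    m          : shared azimuthal order (the delta_mm' term is assumed to be 1)
    """
    prefactor = (0.4898 * 2.0 ** (4.0 / 3.0) * math.pi
                 * math.sqrt((n + 1) * (n_prime + 1))
                 * (-1) ** ((n + n_prime - 2 * m) // 2))
    numerator = (math.gamma(14.0 / 3.0)
                 * math.gamma((n + n_prime - 5.0 / 3.0) / 2.0))
    denominator = (2.0 ** (14.0 / 3.0)
                   * math.gamma((n_prime - n + 17.0 / 3.0) / 2.0)
                   * math.gamma((n + n_prime + 23.0 / 3.0) / 2.0)
                   * math.gamma((n - n_prime + 17.0 / 3.0) / 2.0))
    return prefactor * numerator / denominator
```

Evaluating `scaled_covariance(1, 1, 1)` and `scaled_covariance(2, 2, 0)` should agree with the tilt and defocus variances on the diagonal of Table 2.5 to the precision shown there, and `scaled_covariance(1, 3, 1)` should agree with the tilt and coma cross term.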

L0 [m]      10^0     10^1     10^2     10^3     10^4     10^5     10^6     10^7
Z_{2,3}     0.171    0.317    0.388    0.421    0.436    0.443    0.446    0.448
Z_{4-6}     0.0220   0.0232   0.0232   0.0232   0.0232   0.0232   0.0232   0.0232
Z_{7-10}    0.00610  0.00619  0.00619  0.00619  0.00619  0.00619  0.00619  0.00619
Z_{11}      0.00244  0.00245  0.00245  0.00245  0.00245  0.00245  0.00245  0.00245

$l_0 = 0$, $r_0 = 0.088$ m, $D_P = 0.088$ m

Table 2.6 Zernike variance versus $L_0$.

l0 [m]      r0/50    r0/10    r0/5     r0/3     r0/2
Z_{2,3}     0.449    0.448    0.447    0.444    0.438
Z_{4-6}     0.0232   0.0230   0.0224   0.0211   0.0190
Z_{7-10}    0.00619  0.00607  0.00573  0.00504  0.00399
Z_{11}      0.00245  0.00237  0.00215  0.00173  0.00117

$L_0 = \infty$, $r_0 = 0.088$ m, $D_P = 0.088$ m

Table 2.7 Zernike variance versus $l_0$.

This section provided a modal decomposition for the wavefront using the Zernike polynomial basis set. Expressions (2.137) and (2.138) demonstrate that, given the Fried parameter, the aperture diameter, and the inertial range, one can construct a model for the variance of the Zernike modes. The variance of each Zernike mode is directly related to the power spectral density of the random atmospheric phase distortion. Clearly this statistical model aids in the construction of the wavefront sensor described earlier. The parameter space for each coefficient theoretically maps to the entire real line because each one is modeled as a zero mean Gaussian random variable. Given the variance, however, the sensor algorithm may choose to truncate the parameter space to a smaller portion of the real line containing the bulk of the probability mass. This sets the range of possible estimates for a maximum likelihood estimation process. Further, the Gaussian prior density may be used to form a maximum a posteriori estimator. With the parameter variance calculations in hand, all that remains is to offer some method for discerning the incoming wavefront from intensity measurements. This leads to the final background topic: a brief review of the linear systems optics model and the concept of an optical transfer function.
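To make the idea of a truncated parameter space concrete, the following Python sketch converts a scaled variance and a turbulence strength into a symmetric search range and a Gaussian log prior. It is a hypothetical illustration, not the estimator developed later in this document: the function names and the default choice of three standard deviations are assumptions.

```python
import math


def search_range(scaled_variance, d_over_r0, n_sigma=3.0):
    """Return (low, high) bounds, in radians, for one Zernike coefficient.

    scaled_variance : E{a_i^2} (D_P/r0)^(-5/3), e.g. a diagonal entry of Table 2.5
    d_over_r0       : turbulence strength D_P/r0
    n_sigma         : half-width of the range in standard deviations (assumed)
    """
    sigma = math.sqrt(scaled_variance * d_over_r0 ** (5.0 / 3.0))
    return (-n_sigma * sigma, n_sigma * sigma)


def log_gaussian_prior(a, variance):
    """Log density of a zero mean Gaussian prior, usable in a MAP cost."""
    return -0.5 * (a * a / variance + math.log(2.0 * math.pi * variance))
```

A maximum likelihood search could restrict each coefficient to the interval returned by `search_range`, while a maximum a posteriori search could add `log_gaussian_prior` to the log likelihood instead of truncating.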


2.4  The  Optical  Transfer  Function  (OTF)

Simulating optical wave propagation and the wavefront sensor environment using the mathematical convenience of linear systems theory will require applying a few assumptions to electromagnetic wave theory. It would be quite beneficial, for instance, to be able to apply the superposition and convolution properties that apply in the realm of linear systems theory to the generic optical system. Maxwell's equations describe the physical properties of electromagnetic waves. The set of assumptions for wave propagation using linear systems begins with those assumptions necessary to derive a scalar wave equation. The scalar wave equation is then combined with diffraction theory to create an integral formulation of the optical field at some aperture location. Simplifying the integral formulation leads to a duality between the field at one aperture location and its Fourier domain representation at some originating aperture. Thus, under reasonable assumptions, an optical system image plane can be considered a convolution of the geometric image and the diffraction pattern created by the aperture. The series of important results to follow is by no means a thorough treatment of Fourier optics, but should provide enough highlights to reinforce the concepts that will be necessary to simulate optical propagation through the atmosphere and the interaction with the wavefront sensor. The following derivations are a summary of the results presented in the popular works of Goodman and Born [19].

Beginning with Maxwell's equations, assume that the medium is linear, isotropic, homogeneous, nondispersive and nonmagnetic. Linearity in the medium may be explained by describing the medium as a system with complex fields as inputs and outputs. The property of linearity applies to the system, $f$, if the following superposition holds for all functions $u_1$ and $u_2$ and all complex constants $a$ and $b$:

\[
f\{a u_1(P) + b u_2(P)\} = a f\{u_1(P)\} + b f\{u_2(P)\}. \tag{2.140}
\]


Under this assumption, the resulting field propagating from the sum of two scaled source fields is equivalent to summing the scaled results of the two source fields propagated independently through the medium. Isotropic indicates that the propagation is independent of the direction of polarization of the field. Homogeneity indicates that the permittivity, $\epsilon$, is constant. The term nondispersive means that the permittivity, $\epsilon$, is not a function of wavelength. Lastly, the medium is assumed to be nonmagnetic, meaning that the medium has vacuum permeability, $\mu = \mu_0$. Under these assumptions, the solution to Maxwell's equations reduces to the scalar wave equation:

\[
\nabla^2 u(P, t) - \frac{n^2}{c^2} \frac{\partial^2 u(P, t)}{\partial t^2} = 0, \tag{2.141}
\]

where $n = \sqrt{\epsilon / \epsilon_0}$ is the index of refraction, $c = 1/\sqrt{\mu_0 \epsilon_0}$ is the vacuum speed of light, $P$ indicates spatial location, $u$ is any component of the vector fields, $\mathcal{E}$ or $\mathcal{H}$, and $t$ is time. Representing the field $u$ as the real part of a complex phasor $U$ gives:

\begin{align}
U(P) &= A(P) \exp\{-j\phi(P)\}, \tag{2.142} \\
u(P, t) &= \mathrm{Re}\{U(P) \exp(-j 2\pi\nu t)\}, \tag{2.143} \\
&= A(P) \cos\{2\pi\nu t + \phi(P)\}. \tag{2.144}
\end{align}

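Forcing the field to satisfy the scalar wave equation produces the familiar time independent Helmholtz equation as follows:

\begin{align}
\nabla^2 u(P, t) - \frac{n^2}{c^2} A(P) \frac{\partial^2}{\partial t^2} \left[ \cos\{2\pi\nu t + \phi(P)\} \right] &= 0, \tag{2.145} \\
\nabla^2 u(P, t) + (2\pi)^2 \nu^2 \frac{n^2}{c^2} A(P) \cos\{2\pi\nu t + \phi(P)\} &= 0, \tag{2.146} \\
\left(\nabla^2 + k^2\right) u &= 0, \tag{2.147}
\end{align}

where $k = 2\pi\nu n / c$ is the wave number.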

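The integral theorem of Helmholtz and Kirchhoff can be developed (see Goodman chapter 3) from the divergence theorem of Gauss [21]:

\[
\iiint_V \left( G \nabla^2 U - U \nabla^2 G \right) dv = \iint_S \left( (\hat{n} \cdot \nabla U)\, G - U\, (\hat{n} \cdot \nabla G) \right) ds, \tag{2.148}
\]

and the Helmholtz equation, (2.147), which provides a relationship between the field at a point and the closed surface around the point:

\[
U(P_0) = \frac{1}{4\pi} \iint_S \left( (\hat{n} \cdot \nabla U)\, G - U\, (\hat{n} \cdot \nabla G) \right) ds, \tag{2.149}
\]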

s 

where the surface $S$ is some surface surrounding the point $P_0$, and the expression $\hat{n} \cdot \nabla(\cdot)$ is equivalent to the derivative taken normal to $S$. This crucial relation is the key to wave optical simulation. It provides an integral formulation for the field at some boundary or aperture at a distance from a known source. The choice of Green's function, $G$, is critical. Rayleigh and Sommerfeld are credited with the formulation in Figure 2.5, which suggests the Half Space Green's function of the form:

\[
G(P_1) = \frac{\exp(jk r_{01})}{r_{01}} - \frac{\exp(jk \tilde{r}_{01})}{\tilde{r}_{01}}. \tag{2.150}
\]

Figure 2.5 Rayleigh-Sommerfeld formulation of diffraction by a plane screen.

This choice of $G$ represents the linear combination of fields from two sources at $P_0$ and $\tilde{P}_0$ oscillating 180° out of phase. Notice the change in integration limits in Figure 2.5. According to the diagram, the integral must now be evaluated over $S_1$, $S_2$ and $\tilde{S}_2$. Sommerfeld simplified the limits of integration by assuming that the field $U$ vanishes at least as fast as a diverging spherical wave. This assumption, known as the Sommerfeld radiation condition, reduces the integral over all of the dashed surface, $S = \bigcup \{S_1, S_2, \tilde{S}_2\}$, to an integral over the plane of the aperture, $S_1$:

\begin{align}
\lim_{R \to \infty} R \left( \frac{\partial U}{\partial n} - jkU \right) &= 0, \tag{2.151} \\
U(P_0) &= \frac{1}{4\pi} \iint_{S_1} \left( (\hat{n} \cdot \nabla U)\, G - U\, (\hat{n} \cdot \nabla G) \right) ds. \tag{2.152}
\end{align}

Substituting $G$ from (2.150) into the integral theorem of Helmholtz and Kirchhoff, (2.149), and assuming that the Sommerfeld radiation condition holds, yields the following integral over the plane of the screen:

\[
U(P_0) = \frac{1}{j\lambda} \iint_{S_1} U(P_1)\, \frac{\exp(jk r_{01})}{r_{01}} \cos(\hat{n}, \mathbf{r}_{01})\, ds, \tag{2.153}
\]


called the Rayleigh-Sommerfeld diffraction formula. Note that the vector $\hat{n}$ is the direction normal to the aperture and the cosine expression with two vector arguments is shorthand for the cosine of the angle between the two argument vectors. Using the Rayleigh-Sommerfeld diffraction formula for optical propagation, the simple thin lens imaging system may be conveniently modeled as two propagations: one from the object to the aperture plane and one from the aperture plane to the image plane. This model will become the basis for discussions to come, and as such, warrants a defined coordinate system. Figure 2.6 portrays the basic three plane imaging system model and respective Cartesian coordinate system labeling.

Figure 2.6 Three plane Cartesian coordinate system. Panels: Object Plane, Aperture Plane, Image Plane.

Assuming a spherical wavefront and converting to Cartesian coordinates, the propagation integral in (2.153) may be rewritten as:

\begin{align}
U(\xi, \eta) &= \frac{s_i}{j\lambda} \iint_{\Sigma} U(x, y)\, \frac{\exp(jk r_{01})}{r_{01}^2}\, dx\, dy, \tag{2.154} \\
\text{where } r_{01} &= \sqrt{s_i^2 + (x - \xi)^2 + (y - \eta)^2}, \tag{2.155}
\end{align}

and the cosine of the normal angle has been replaced by the small angle approximation:

\[
\cos(\hat{n}, \mathbf{r}_{01}) = \frac{s_i}{r_{01}}. \tag{2.156}
\]


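The linear systems approach will require an approximation for the square root in the $r_{01}$ terms. The radial distance in the exponential phase term, $\exp(jk r_{01})$, and in the denominator, $1/r_{01}^2$, may each be replaced by a truncated binomial series expansion. The binomial expansion for $r_{01}$ is shown here:

\begin{align}
r_{01} &= s_i \sqrt{1 + \left(\frac{x - \xi}{s_i}\right)^2 + \left(\frac{y - \eta}{s_i}\right)^2}, \tag{2.157} \\
\sqrt{1 + b} &= 1 + \frac{b}{2} - \frac{b^2}{8} + \ldots, \tag{2.158} \\
r_{01} &= s_i \left( 1 + \frac{1}{2} \left[ \left(\frac{x - \xi}{s_i}\right)^2 + \left(\frac{y - \eta}{s_i}\right)^2 \right] - \frac{1}{8} \left[ \left(\frac{x - \xi}{s_i}\right)^2 + \left(\frac{y - \eta}{s_i}\right)^2 \right]^2 + \ldots \right). \tag{2.159}
\end{align}

Retaining the first term yields the approximation:

\[
r_{01} \approx s_i. \tag{2.160}
\]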


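Substituting this approximation for $r_{01}$ into the denominator term gives:

$$\frac{1}{r_{01}^{2}} \approx \frac{1}{s_i^{2}}. \tag{2.161}$$

Retaining the first two terms yields the approximation:

$$r_{01} \approx s_i + \frac{1}{2 s_i}\left( (x-\xi)^{2} + (y-\eta)^{2} \right). \tag{2.162}$$

Substituting this approximation for $r_{01}$ into the phase term gives:

$$\exp(jk r_{01}) \approx \exp\left( jk \left[ s_i + \frac{1}{2 s_i}\left( (x-\xi)^{2} + (y-\eta)^{2} \right) \right] \right) \tag{2.163}$$

$$= \exp(jk s_i)\, \exp\left( \frac{jk}{2 s_i} (x-\xi)^{2} \right) \exp\left( \frac{jk}{2 s_i} (y-\eta)^{2} \right). \tag{2.164}$$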
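As a quick numerical check on the two-term approximation in (2.162), the sketch below compares it against the exact distance in (2.155). The function names and the example geometry (a 1 m propagation with a 1 cm transverse offset) are assumptions chosen only for illustration.

```python
import math

def exact_distance(s_i, dx, dy):
    """Exact radial distance r01 from (2.155); dx = x - xi, dy = y - eta."""
    return math.sqrt(s_i ** 2 + dx ** 2 + dy ** 2)

def two_term_distance(s_i, dx, dy):
    """Two-term binomial approximation of r01 from (2.162)."""
    return s_i + (dx ** 2 + dy ** 2) / (2.0 * s_i)

# Hypothetical geometry: 1 m propagation, 1 cm transverse offset.
s_i, dx, dy = 1.0, 0.01, 0.0
error = two_term_distance(s_i, dx, dy) - exact_distance(s_i, dx, dy)
print(f"path length error of the two-term approximation: {error:.3e} m")
```

The residual is dominated by the first neglected term of the series in (2.158), which is why the approximation is acceptable in the phase term only while this error stays small compared to the wavelength.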




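Making both substitutions in (2.154) and simplifying produces the following two familiar
forms of the Fresnel diffraction integral:

$$U(\xi,\eta) = \frac{\exp(jk s_i)}{j\lambda s_i} \exp\left\{ \frac{jk}{2 s_i}(\xi^{2}+\eta^{2}) \right\} \iint_{-\infty}^{\infty} U(x,y) \exp\left\{ \frac{jk}{2 s_i}(x^{2}+y^{2}) \right\} \exp\left\{ -j\frac{2\pi}{\lambda s_i}(\xi x + \eta y) \right\} dx\, dy, \tag{2.165}$$

$$U(\xi,\eta) = \frac{\exp(jk s_i)}{j\lambda s_i} \iint_{-\infty}^{\infty} U(x,y) \exp\left\{ \frac{jk}{2 s_i}\left[ (\xi-x)^{2} + (\eta-y)^{2} \right] \right\} dx\, dy. \tag{2.166}$$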


Under certain conditions, the quadratic phase term in the Fresnel integral can be removed.
If a converging spherical lens is in place, for instance, then the quadratic phase is exactly
canceled by the focusing properties of the lens. Also, in cases where the aperture does not
contain a converging lens but the propagation is applied over a sufficiently long distance, the
effect of the quadratic phase over some small aperture becomes negligible. The propagation
distance at which the quadratic phase term in the Fresnel integral becomes negligible is often
referred to as the far-field condition:


$$s_i > \frac{2 D_P^{2}}{\lambda}. \tag{2.167}$$
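To give a sense of scale for the far-field condition, the following sketch evaluates the distance $2 D_P^{2}/\lambda$ for one aperture. The function name and the example values (a 1 mm aperture at 500 nm) are assumptions for illustration and are not taken from the sensor design.

```python
def far_field_distance(aperture_diameter, wavelength):
    """Distance 2 * D_P**2 / lambda appearing in the far-field condition."""
    return 2.0 * aperture_diameter ** 2 / wavelength

# Hypothetical example: 1 mm aperture illuminated with 500 nm light.
distance = far_field_distance(1e-3, 500e-9)
print(f"far-field distance: {distance:.1f} m")
```

Because the distance grows with the square of the aperture diameter, apertures of practical size rarely satisfy the condition in free space, which is one reason the lens-based cancellation of the quadratic phase matters in the derivation that follows.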


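Assuming that far-field conditions hold, the integral in (2.166) simplifies to the Fraunhofer
diffraction integral:

$$U(\xi,\eta) = \frac{\exp(jk s_i) \exp\left\{ \frac{jk}{2 s_i}(\xi^{2}+\eta^{2}) \right\}}{j\lambda s_i} \iint_{-\infty}^{\infty} U(x,y) \exp\left\{ -j\frac{2\pi}{\lambda s_i}(\xi x + \eta y) \right\} dx\, dy. \tag{2.168}$$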


Under these circumstances, the solution for the image field, $U(\xi, \eta)$, becomes a scaled Fourier
transform. Using this theory, a single spherical lens optical system can be modeled as a
linear filter, where the impulse response for the system is the Fraunhofer diffraction integral
of the system pupil. In order to recognize this property and the additional assumptions
required, consider first the assumption mentioned previously, that the medium is linear and
therefore the following superposition holds for monochromatic light:


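$$U_i(\xi,\eta) = \iint_{-\infty}^{\infty} h(\xi - \alpha, \eta - \beta)\, U_o(\alpha, \beta)\, d\alpha\, d\beta, \tag{2.169}$$

$$\text{where } U_i = \text{field in the image plane}, \tag{2.170}$$

$$U_o = \text{field in the object plane}, \tag{2.171}$$

$$\text{and } h = \text{optical system impulse response}. \tag{2.172}$$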

A  side  note  is  required  before  proceeding  with  the  derivation.  The  wavefront  sensor  is 
designed  to  operate  using  polychromatic  incoherent  light.  However,  the  case  is  made  by 
Goodman  that  the  solution  for  the  monochromatic  field  can  be  transformed  into  a  similar 
approach  for  polychromatic  incoherent  light  by  modeling  such  a  system  as  the  average  of 
contributions from many incoherent monochromatic sources [20]. Therefore the case will
be  made  following  this  derivation  that  an  incoherent  system  is  linear,  not  in  the  field,  but 
rather  in  intensity  and,  as  such,  a  similar  convolution  integral  may  be  introduced  for  the 
case  of  incoherent  light.  Returning  now  to  the  monochromatic  case,  propagation  through  a 
single  lens  system  can  be  represented  as  a  convolution  of  the  object  field  with  some  impulse 
response, $h$. In order to derive the impulse response, consider the response of the system
in Figure 2.7 to a point source.

[Figure 2.7: Model of a simple thin lens imaging system illuminated by a point source (point source input, spherical thin lens of focal length $f$, output field).]

The paraxial representation of a spherical wave at the
aperture plane location $(x, y)$ emanating from the object plane at point $(\alpha, \beta)$ is given by:


$$\frac{1}{j\lambda s_o} \exp\left\{ j\frac{k}{2 s_o}\left[ (x-\alpha)^{2} + (y-\beta)^{2} \right] \right\}. \tag{2.173}$$


The  lens  applies  a  quadratic  phase  and  an  aperture  window  function,  Wp: 


$$W_P(x,y;R_P) \exp\left\{ -j\frac{k}{2f}(x^{2}+y^{2}) \right\}. \tag{2.174}$$


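Combining (2.173) and (2.174) in the second form of the Fresnel integral in (2.166), the
impulse response is given by (neglecting constant phase terms):

$$h(\xi,\eta;\alpha,\beta) = \frac{1}{\lambda s_i \lambda s_o} \iint_{-\infty}^{\infty} W_P(x,y;R_P) \exp\left\{ j\frac{k}{2 s_o}\left[ (x-\alpha)^{2} + (y-\beta)^{2} \right] \right\} \exp\left\{ -j\frac{k}{2f}(x^{2}+y^{2}) \right\} \exp\left\{ j\frac{k}{2 s_i}\left[ (\xi-x)^{2} + (\eta-y)^{2} \right] \right\} dx\, dy. \tag{2.175}$$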


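Expanding and analyzing similar terms within the three phase components yields:

$$\exp\left\{ j\frac{k}{2 s_o}\left[ (x-\alpha)^{2} + (y-\beta)^{2} \right] \right\} = \exp\left\{ j\frac{k}{2 s_o}(x^{2}+y^{2}) \right\} \exp\left\{ -jk\left( \frac{x\alpha}{s_o} + \frac{y\beta}{s_o} \right) \right\} \times \exp\left\{ j\frac{k}{2 s_o}(\alpha^{2}+\beta^{2}) \right\}, \tag{2.176}$$

$$\exp\left\{ -j\frac{k}{2f}(x^{2}+y^{2}) \right\} = \exp\left\{ -j\frac{k}{2f}x^{2} \right\} \exp\left\{ -j\frac{k}{2f}y^{2} \right\}, \tag{2.177}$$

$$\exp\left\{ j\frac{k}{2 s_i}\left[ (\xi-x)^{2} + (\eta-y)^{2} \right] \right\} = \exp\left\{ j\frac{k}{2 s_i}(\xi^{2}+\eta^{2}) \right\} \exp\left\{ -jk\left( \frac{x\xi}{s_i} + \frac{y\eta}{s_i} \right) \right\} \times \exp\left\{ j\frac{k}{2 s_i}(x^{2}+y^{2}) \right\}. \tag{2.178}$$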


If the image plane is placed such that $\frac{1}{f} = \frac{1}{s_o} + \frac{1}{s_i}$, then the quadratic phase in $x$ and $y$
is cancelled by the quadratic phase contribution of the lens. Furthermore, assuming that
the quadratic phase in $\alpha$ and $\beta$ is nearly zero over the region of the image plane affected
by the point source allows that quadratic to be removed as well. After making these
simplifications, the remaining integral is given by:


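$$h(\xi,\eta;\alpha,\beta) = \frac{\exp\left\{ j\frac{k}{2 s_i}(\xi^{2}+\eta^{2}) \right\}}{\lambda^{2} s_i s_o} \iint_{-\infty}^{\infty} W_P(x,y;R_P) \exp\left\{ -jk\left[ \left( \frac{\xi}{s_i} + \frac{\alpha}{s_o} \right) x + \left( \frac{\eta}{s_i} + \frac{\beta}{s_o} \right) y \right] \right\} dx\, dy. \tag{2.179}$$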


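Now define the transverse magnification to be $M = -\frac{s_i}{s_o}$ and make the coordinate changes
$\tilde{\alpha} = M\alpha$, $\tilde{\beta} = M\beta$, $\tilde{x} = \frac{x}{\lambda s_i}$, $\tilde{y} = \frac{y}{\lambda s_i}$ and $\tilde{h} = \frac{1}{|M|}h$:

$$h(\xi,\eta;\alpha,\beta) = \frac{\exp\left\{ j\frac{k}{2 s_i}(\xi^{2}+\eta^{2}) \right\}}{\lambda^{2} s_i s_o} \iint_{-\infty}^{\infty} \lambda s_i\, \lambda s_i\, d\tilde{x}\, d\tilde{y}\; W_P(\lambda s_i \tilde{x}, \lambda s_i \tilde{y}; R_P) \times \exp\left\{ -j\frac{2\pi}{\lambda}\left[ \left( \frac{\xi}{s_i} + \frac{\alpha}{s_o} \right) \lambda s_i \tilde{x} + \left( \frac{\eta}{s_i} + \frac{\beta}{s_o} \right) \lambda s_i \tilde{y} \right] \right\}, \tag{2.180}$$

$$\tilde{h}(\xi,\eta;\tilde{\alpha},\tilde{\beta}) = \exp\left\{ j\frac{k}{2 s_i}(\xi^{2}+\eta^{2}) \right\} \iint_{-\infty}^{\infty} W_P(\lambda s_i \tilde{x}, \lambda s_i \tilde{y}; R_P) \exp\left\{ -j2\pi\left[ (\xi - \tilde{\alpha})\tilde{x} + (\eta - \tilde{\beta})\tilde{y} \right] \right\} d\tilde{x}\, d\tilde{y}. \tag{2.181}$$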


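Making the appropriate variable change from $\alpha$ and $\beta$ to $\tilde{\alpha}$ and $\tilde{\beta}$ in (2.169), $h$ becomes $\tilde{h}$
and the field in the image plane of a converging spherical lens system can be represented
as:

$$U_i(\xi,\eta) = \iint_{-\infty}^{\infty} \tilde{h}(\xi - \tilde{\alpha}, \eta - \tilde{\beta})\, U_g(\tilde{\alpha}, \tilde{\beta})\, d\tilde{\alpha}\, d\tilde{\beta}, \tag{2.182}$$

$$\text{where } \tilde{h}(\xi,\eta) = \exp\left\{ j\frac{k}{2 s_i}(\xi^{2}+\eta^{2}) \right\} \iint_{-\infty}^{\infty} W_P(\lambda s_i \tilde{x}, \lambda s_i \tilde{y}; R_P) \exp\left\{ -j2\pi(\xi\tilde{x} + \eta\tilde{y}) \right\} d\tilde{x}\, d\tilde{y}, \tag{2.183}$$

$$M = \text{transverse magnification}, \tag{2.184}$$

$$\text{and } U_g(\tilde{\alpha},\tilde{\beta}) = \frac{1}{|M|}\, U_o\!\left( \frac{\tilde{\alpha}}{M}, \frac{\tilde{\beta}}{M} \right). \tag{2.185}$$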
Thus, for the monochromatic case, the image field is a convolution of the image field
predicted by geometric optics, $U_g(\tilde{\alpha},\tilde{\beta})$, and the amplitude impulse response, $\tilde{h}(\xi,\eta)$, where
the amplitude impulse response of the optical system, given in (2.183) above, is the
Fraunhofer diffraction integral applied to the pupil window, $W_P(x,y)$.

If the input source is polychromatic incoherent light, modeled as the average of many
contributions from incoherent monochromatic sources, then it follows that the incoherent
imaging system is linear in intensity [20]. Under this condition, the field quantities are
replaced by field intensities and the amplitude impulse response becomes an intensity
impulse response, $|\tilde{h}(\xi,\eta)|^{2}$, which is the magnitude squared Fraunhofer diffraction pattern.
Therefore, for incoherent light, the spatial convolution integral is given by


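$$I_i(\xi,\eta) = \kappa \iint_{-\infty}^{\infty} \left| \tilde{h}(\xi - \tilde{\alpha}, \eta - \tilde{\beta}) \right|^{2} I_g(\tilde{\alpha}, \tilde{\beta})\, d\tilde{\alpha}\, d\tilde{\beta}, \tag{2.186}$$

$$\kappa = \text{real scaling constant}. \tag{2.187}$$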


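The dual of this expression in the spatial frequency domain is:

$$\mathcal{G}_i(f_X, f_Y) = \mathcal{H}(f_X, f_Y)\, \mathcal{G}_g(f_X, f_Y), \tag{2.188}$$

where $\mathcal{G}_g$ and $\mathcal{G}_i$ are the normalized frequency domain transforms of the geometric and
diffraction image intensities, and $\mathcal{H}$ is the transform of the impulse response, commonly
referred to as the Optical Transfer Function (OTF):

$$\mathcal{G}_g(f_X, f_Y) = \frac{\iint_{-\infty}^{\infty} I_g(\xi,\eta) \exp\{-j2\pi(f_X \xi + f_Y \eta)\}\, d\xi\, d\eta}{\iint_{-\infty}^{\infty} I_g(\xi,\eta)\, d\xi\, d\eta}, \tag{2.189}$$

$$\mathcal{G}_i(f_X, f_Y) = \frac{\iint_{-\infty}^{\infty} I_i(\xi,\eta) \exp\{-j2\pi(f_X \xi + f_Y \eta)\}\, d\xi\, d\eta}{\iint_{-\infty}^{\infty} I_i(\xi,\eta)\, d\xi\, d\eta}, \tag{2.190}$$

$$\mathcal{H}(f_X, f_Y) = \frac{\iint_{-\infty}^{\infty} |\tilde{h}(\xi,\eta)|^{2} \exp\{-j2\pi(f_X \xi + f_Y \eta)\}\, d\xi\, d\eta}{\iint_{-\infty}^{\infty} |\tilde{h}(\xi,\eta)|^{2}\, d\xi\, d\eta}. \tag{2.191}$$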


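As a consequence of the linear systems assumptions, the OTF provides a relationship
between the pupil phase and the image intensity. For incoherent imaging, the OTF is the
normalized autocorrelation of the pupil function:

$$\mathcal{H}(f_X, f_Y) = \frac{\iint_{-\infty}^{\infty} \mathcal{P}\!\left( x + \frac{\lambda s_i f_X}{2}, y + \frac{\lambda s_i f_Y}{2} \right) \mathcal{P}^{*}\!\left( x - \frac{\lambda s_i f_X}{2}, y - \frac{\lambda s_i f_Y}{2} \right) dx\, dy}{\iint_{-\infty}^{\infty} \mathcal{P}(x,y)\, \mathcal{P}^{*}(x,y)\, dx\, dy}. \tag{2.192}$$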
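The autocorrelation form of the OTF can be illustrated on a small sample grid. The sketch below is a minimal discrete analogue under stated assumptions: the pupil is a hypothetical 4 x 4 square array rather than a circular aperture, the integer shift $(s_x, s_y)$ stands in for the total displacement $\lambda s_i (f_X, f_Y)$ between the two pupil copies, and the function name is illustrative.

```python
import cmath

def pupil_autocorrelation(pupil):
    """Normalized autocorrelation of a square, complex-valued pupil array.

    Returns a dict keyed by the integer shift (sx, sy) between the two
    copies of the pupil, normalized by the total pupil energy.
    """
    n = len(pupil)
    energy = sum(abs(value) ** 2 for row in pupil for value in row)
    otf = {}
    for sx in range(-(n - 1), n):
        for sy in range(-(n - 1), n):
            acc = 0j
            for x in range(n):
                for y in range(n):
                    xs, ys = x - sx, y - sy
                    if 0 <= xs < n and 0 <= ys < n:
                        acc += pupil[x][y] * pupil[xs][ys].conjugate()
            otf[(sx, sy)] = acc / energy
    return otf

# Hypothetical 4 x 4 pupil: first unaberrated, then with a linear
# (tilt-like) phase of 0.3 rad per sample along x.
n, tilt = 4, 0.3
flat = [[1.0 + 0j for _ in range(n)] for _ in range(n)]
tilted = [[cmath.exp(1j * tilt * x) for _ in range(n)] for x in range(n)]

otf_flat = pupil_autocorrelation(flat)
otf_tilt = pupil_autocorrelation(tilted)
print(abs(otf_flat[(1, 0)]), abs(otf_tilt[(1, 0)]), cmath.phase(otf_tilt[(1, 0)]))
```

In this toy case a pure linear phase leaves the OTF magnitude unchanged and adds a phase proportional to the shift, which is the simplest instance of a pupil phase term leaving a distinct signature in the OTF.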


Combining (2.192) and the modal wavefront representation in (2.89) gives a direct method
for calculating the effects of any combination of Zernike modes on the OTF. Consider the
example of a diffraction limited imaging system with a plane wave input. Use the circular
windowing function in (2.90) and the Zernike expansion for phase to represent the pupil
expression. In this special case, the OTF becomes:

$$\mathcal{P}(x,y;R_P) = W_P(x,y;R_P) \exp\left\{ j\sum_{i=2}^{N} a_i Z_i(x,y;R_P) \right\}, \tag{2.193}$$

$$\mathcal{H}(f_X, f_Y) = \iint \mathcal{P}\!\left( x + \frac{\lambda s_i f_X}{2}, y + \frac{\lambda s_i f_Y}{2} \right) \mathcal{P}^{*}\!\left( x - \frac{\lambda s_i f_X}{2}, y - \frac{\lambda s_i f_Y}{2} \right) dx\, dy. \tag{2.194}$$


In this example, the normalized transform of the geometric image, $\mathcal{G}_g(f_X, f_Y)$, is unity.
This implies that the OTF and the image are direct Fourier transforms:

$$\mathcal{H}(f_X, f_Y) = \mathcal{G}_i(f_X, f_Y). \tag{2.195}$$


Figure 2.8 provides a visual comparison of OTFs for this example. The OTF with no
aberrations is compared to the OTF under the influence of the first 6 Zernikes independently.
The unique effects created by each Zernike mode provide a means for distinguishing the
presence of one mode over another. Here the OTF, the Fourier transform of the image,
contains the information needed to estimate the amount of each Zernike parameter present
in the optical system. Granted, this is a very simplified case, but it demonstrates the
relationship between pupil phase, which cannot be measured, and image intensity, which
can be measured directly. Consequently, this theory reveals the possibility of estimating
pupil phase from image intensity measurements.

[Figure 2.8: Top: Simulated point spread functions for a diffraction limited optical system
and systems under independent influence from Zernikes 2-6. Middle: the real
part of the Optical Transfer Functions (OTFs). Bottom: the imaginary part
of the OTF.]

2.5  Summary 

The  chapter  began  with  a  discussion  on  parameter  estimation  which  concluded  that 
parameter  estimation  problems  require  a  probabilistic  mapping  from  the  parameter  space 
to  the  observation  space,  pr|a(R|A).  Additionally,  parameter  statistical  properties  such  as 
mean  and  variance  are  helpful,  while  a  probability  distribution  for  the  parameters,  pa(A), 
is  highly  desired.  In  the  case  of  the  wavefront  curvature  estimator,  the  observation  space, 
R,  is  the  image  intensity.  The  wavefront  phase  was  parameterized  using  the  well  known  set 
of  Zernike  polynomial  coefficients. 

I  carefully  reviewed  the  origin  of  Kolmogorov’s  turbulence  model  and  how  it  is  related 
to  atmospheric  phase  fluctuations  in  the  optical  wavelength  range.  The  Kolmogorov  model 
was then modified outside of the inertial range to produce the well known von Kármán
turbulence model. The von Kármán statistic was used to derive both phase power spectral
density  and  structure  functions.  From  the  turbulence  model,  I  was  able  to  deduce  that 
the  Zernike  coefficients  can  be  modeled  by  zero  mean  Gaussian  distributions.  Additionally, 
from  Noll’s  efforts,  the  variance  in  each  Zernike  coefficient  is  related  to  atmospheric  seeing 
conditions  and  the  diameter  of  the  optical  system.  Under  these  assumptions,  the  mean  and 
variance  of  each  coefficient  fully  describes  its  distribution,  pa(A ). 

Finally,  a  linear  model  for  optical  wave  propagation  was  discussed.  Goodman’s 
background  on  the  derivation  of  the  linear  model  revealed  the  assumptions  necessary  to 
view  the  optical  system  and  free  space  propagation  as  a  linear  system.  The  linear  model  led 
to  the  optical  transfer  function  and  its  relationship  to  the  pupil  phase.  The  OTF  provides 
a  means  for  linking  the  observed  image  intensity  to  the  optical  field  in  the  aperture,  the 
final  ingredient  needed  to  begin  constructing  a  wavefront  sensor. 
3. The Discrete Model


The optical field and the optical system's influence on that field are naturally continuous
electromagnetic wave phenomena. For this reason, the derivations and discussions have
been  based  on  continuous  variables  and  functions.  However,  there  are  two  key  reasons 
for  converting  the  set  of  continuous  mathematical  constructs  into  a  discrete  parallel.  First, 
simulation  using  a  PC  requires  that  all  the  continuous  models  be  converted  to  some  discrete 
approximation.  Second,  and  perhaps  most  importantly,  the  digital  imaging  system  is  a 
naturally  discrete  system.  Understanding  the  discrete  nature  of  this  type  of  system  and 
its  interface  to  the  environment  allows  for  better  mathematical  representation  than  simply 
sampling  or  approximating  some  analog  equivalent. 

This chapter begins by establishing discrete versions of the continuous reference frames
and propagation integrals provided in Chapter 2. After establishing a discrete version of the
linear systems techniques from Chapter 2, the discussion focuses on modeling the Charge
Coupled Device array in the image plane. Modeling each detected pixel as a stochastic
process  will  provide  the  final  ingredient  for  the  estimator:  a  probabilistic  map  from  the 
detected  image  intensity  to  specific  modes  in  the  wavefront  phase.  Once  the  probabilistic 
map  is  in  place,  the  concept  of  an  image  projection  is  introduced.  A  mathematical  construct 
called  the  image  projection  operator  is  used  to  describe  the  process  of  extracting  image 
projections  from  a  set  of  CCD  arrays.  The  image  projection  is  then  combined  with  the 
concepts  of  a  discrete  linear  system  and  the  detected  image  probability  map  to  form  the 
wavefront  curvature  estimator.  The  estimator  derived  here  provides  the  mathematical 
foundation  for  the  wavefront  curvature  sensor.  All  that  will  remain  will  be  to  establish  a 
fast  and  efficient  algorithm  for  evaluating  the  estimator  expressions. 

3.1  The  Discrete  Reference  Frame 

In order to discuss the discrete image space, it is best to begin by defining a complete
set of discrete variables and definitions around the simple imaging system described in
Section 2.4. Many of the variables and reference frames introduced here will be referred
to throughout the remainder of the dissertation. Figure 3.1 provides an overview of the
object, aperture and image plane axes labeling convention. Each plane is divided into
grids of sample points or pixels. The pixel grids are built around the requirements that
grids are equispaced Cartesian meshes, apertures are circular and an even number of sample
points spans each side of the square grid covering the aperture. Actual dimensions of a grid
pixel are immaterial to functions that manipulate arrays of discrete data. In such cases, the
associated index variables, $[n, m]$ or $[u, v]$ for instance, will be used instead of discrete increments
of $(x, y)$ and $(\xi, \eta)$ to provide a more general description. When discrete increments of
continuous variables are required, the distinction from continuous variables will be made by
replacing parentheses with brackets. For instance, the function $f(x)$ is assumed to exist for
all $x$ while the same function denoted $f[x]$ is meant to indicate the values of $f$ over some
sampled set of $x$ values.

[Figure 3.1: Axes labeling convention (object plane, aperture plane, image plane).]

Begin with the notation for the continuous aperture field $\mathcal{P}$ with
arbitrary amplitude, $A_P$, and phase, $\phi_{\mathcal{P}}$:

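$$\mathcal{P}(x, y; \mathbf{a}, R_P) = A_P(x, y)\, W_P(x, y; R_P) \exp\{ j\phi_{\mathcal{P}}(x, y; \mathbf{a}) \}. \tag{3.1}$$

Bold index and coordinate variables may be used to compact notation where possible. For
instance, the bold variable $\mathbf{x}$ is the compact representation for the coordinate pair $(x,y)$:

$$\mathcal{P}(\mathbf{x}; \mathbf{a}, R_P) = A_P(\mathbf{x})\, W_P(\mathbf{x}; R_P) \exp\{ j\phi_{\mathcal{P}}(\mathbf{x}; \mathbf{a}) \}. \tag{3.2}$$

Recall that the pupil phase can be represented by a series of weighted Zernike polynomials:

$$\phi_{\mathcal{P}}(\mathbf{x}; \mathbf{a}) = \sum_{i=1}^{\infty} a_i Z_i(\mathbf{x}; R_P), \tag{3.3}$$

$$\text{where } \mathbf{a} = \text{the infinite set of Zernike coefficients}. \tag{3.4}$$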
Assuming that the pupil is circular, the window function, $W_P(x, y; R_P)$, has amplitude 1
over the circle with radius $R_P$ and zero outside the circle. The discrete coordinates $[n, m]$
can be derived from the continuous coordinates via the relationship $(x, y) = (n\Delta x, m\Delta y)$.
Thus, in the discrete coordinate frame, the aperture field with atmospheric phase aberrations
is given by:

$$\mathcal{P}[n, m; \mathbf{a}, R_P] = A_P[n, m]\, W_P[n, m; R_P] \exp\{ j\phi_{\mathcal{P}}[n, m; \mathbf{a}] \}. \tag{3.5}$$

Once again, more compactly:

$$\mathcal{P}[\mathbf{n}; \mathbf{a}, R_P] = A_P[\mathbf{n}]\, W_P[\mathbf{n}; R_P] \exp\{ j\phi_{\mathcal{P}}[\mathbf{n}; \mathbf{a}] \}. \tag{3.6}$$


In simulation, the plane grids must be composed of equispaced Cartesian samples. The
circular aperture weighting function is modeled as accurately as possible. The diagram
in Figure 3.2 demonstrates a 16 x 16 aperture weighting function $W_P$. Notice that pixels
along the edge of the aperture mask are weighted in proportion to the amount of the aperture
included within the area of the pixel.

[Figure 3.2: Example aperture mask for a 16 x 16 pixel aperture grid.]
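One simple way to approximate such an area-weighted mask is to supersample each pixel and count the fraction of sub-samples that fall inside the circle. The sketch below takes that approach; it is an assumption about how the edge weights could be computed, not necessarily the method used to generate Figure 3.2, and the function name and the `sub` parameter are illustrative.

```python
import math

def aperture_mask(n, sub=8):
    """Return an n x n circular aperture mask with fractional edge weights.

    The circle is centred on the grid with radius n / 2 (n is assumed even).
    Each pixel is split into sub x sub sub-samples, and its weight is the
    fraction of sub-sample centres lying inside the circle.
    """
    radius = n / 2.0
    mask = []
    for row in range(n):
        line = []
        for col in range(n):
            inside = 0
            for i in range(sub):
                for j in range(sub):
                    # Sub-sample centre measured from the centre of the grid.
                    y = row + (i + 0.5) / sub - radius
                    x = col + (j + 0.5) / sub - radius
                    if x * x + y * y <= radius * radius:
                        inside += 1
            line.append(inside / (sub * sub))
        mask.append(line)
    return mask

mask = aperture_mask(16)
total = sum(sum(line) for line in mask)
print(f"mask area {total:.2f} pixels, ideal circle {math.pi * 8.0 ** 2:.2f} pixels")
```

Interior pixels receive weight 1, pixels outside the circle receive weight 0, and only the pixels cut by the aperture edge take intermediate values. Increasing `sub` tightens the estimate of the covered area at the cost of more evaluations.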
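The Background chapter closed with a derivation of the linear systems model for
propagation. The linear systems model depicted the intensity impulse response of a simple
thin lens as the magnitude squared of the Fraunhofer diffraction integral applied to the
aperture window function:

$$|\tilde{h}(\xi,\eta)|^{2} = \left| \exp\left\{ j\frac{k}{2 s_i}(\xi^{2}+\eta^{2}) \right\} \iint_{-\infty}^{\infty} W_P(\lambda s_i \tilde{x}, \lambda s_i \tilde{y}; R_P) \exp\{ -j2\pi(\xi\tilde{x} + \eta\tilde{y}) \}\, d\tilde{x}\, d\tilde{y} \right|^{2} \tag{3.7}$$

$$= \left| \iint_{-\infty}^{\infty} W_P(\lambda s_i \tilde{x}, \lambda s_i \tilde{y}; R_P) \exp\{ -j2\pi(\xi\tilde{x} + \eta\tilde{y}) \}\, d\tilde{x}\, d\tilde{y} \right|^{2}. \tag{3.8}$$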


The magnitude of the leading complex exponential is unity. The remaining integral is a
scaled Fourier transform which I will denote by $\mathcal{F}_s\{\cdot\}$. Using this integral transform, the
intensity impulse response for the complex pupil expression can be formed:

$$|\tilde{h}(\xi,\eta; \mathbf{a}, R_P)|^{2} = \left| \mathcal{F}_s\{ \mathcal{P}(\mathbf{x}; \mathbf{a}, R_P) \} \right|^{2}. \tag{3.9}$$


The  intensity  in  the  image  plane  is  then  the  convolution  of  the  image  intensity  predicted 
by  geometric  optics  and  the  intensity  impulse  response: 


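$$I_i(\xi,\eta; \mathbf{a}, R_P) = \kappa \iint_{-\infty}^{\infty} \left| \tilde{h}(\xi - \tilde{\alpha}, \eta - \tilde{\beta}; \mathbf{a}, R_P) \right|^{2} I_g(\tilde{\alpha}, \tilde{\beta})\, d\tilde{\alpha}\, d\tilde{\beta}. \tag{3.10}$$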


If  Ig  is  a  point  source  then  the  intensity  impulse  response  is  the  image  intensity: 


$$I_i(\xi,\eta; \mathbf{a}, R_P) = \iint_{-\infty}^{\infty} \left| \tilde{h}(\xi - \tilde{\alpha}, \eta - \tilde{\beta}; \mathbf{a}, R_P) \right|^{2} \delta(\tilde{\alpha}, \tilde{\beta})\, d\tilde{\alpha}\, d\tilde{\beta}, \tag{3.11}$$

$$I(\xi,\eta; \mathbf{a}, R_P) = |\tilde{h}(\xi,\eta; \mathbf{a}, R_P)|^{2}, \tag{3.12}$$

where image intensity scaling has been neglected. Here I have introduced the new variable
$I$, which is the intensity impulse response for a pupil with atmospheric phase contributions.
$I$ is often referred to as a point spread function or PSF; however, since the wavefront sensor
will be simulated using a point source, the variable $I$ will henceforth serve as the expected
image variable.

Using  the  discrete  reference  frames  and  variables,  it  is  straightforward  to  convert  the 
continuous  propagation  integral  to  its  discrete  counterpart.  The  Fourier  integrals  and  their 
respective  discrete  forms  are  presented  here.  Recall  the  two-dimensional  Fourier  transform 
pair: 

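$$\mathcal{F}\{D\} = \mathcal{D}(f_X, f_Y) = \iint_{-\infty}^{\infty} D(x,y) \exp\{ -j2\pi(f_X x + f_Y y) \}\, dx\, dy, \tag{3.13}$$

$$\mathcal{F}^{-1}\{\mathcal{D}\} = D(x,y) = \iint_{-\infty}^{\infty} \mathcal{D}(f_X, f_Y) \exp\{ j2\pi(f_X x + f_Y y) \}\, df_X\, df_Y. \tag{3.14}$$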


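The discrete counterparts of these operations are given here as the discrete Fourier series,
$\mathcal{DFS}\{\cdot\}$, and the discrete Fourier transform, $\mathcal{DFT}\{\cdot\}$:

$$\mathcal{DFS}\{D\} = \mathcal{D}[f_X, f_Y] = \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} D[n\Delta x, m\Delta y] \exp\{ -j2\pi(n\Delta x f_X + m\Delta y f_Y) \}\, \Delta x\, \Delta y, \tag{3.15}$$

$$\mathcal{DFT}\{D\} = \mathcal{D}[u, v] = \sum_{n=-N/2}^{N/2-1} \sum_{m=-N/2}^{N/2-1} D[n, m] \exp\left\{ -j\frac{2\pi}{N}(nu + mv) \right\}, \tag{3.16}$$

$$\mathcal{DFT}^{-1}\{\mathcal{D}\} = D[n, m] = \frac{1}{N^{2}} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \mathcal{D}[u, v] \exp\left\{ j\frac{2\pi}{N}(nu + mv) \right\}. \tag{3.17}$$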
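The transform pair in (3.16) and (3.17) can be checked directly on a small array. The sketch below evaluates both sums literally, with the spatial indices running from $-N/2$ to $N/2-1$ and the frequency indices from $0$ to $N-1$. It is a direct summation meant only to illustrate the index conventions, not the fast algorithm a simulation would use; the function names and the test values are assumptions.

```python
import cmath

def dft2(data, n_side):
    """Forward transform (3.16); `data` is keyed by (n, m) in [-N/2, N/2 - 1]."""
    half = n_side // 2
    spectrum = {}
    for u in range(n_side):
        for v in range(n_side):
            acc = 0j
            for n in range(-half, half):
                for m in range(-half, half):
                    phase = -2j * cmath.pi * (n * u + m * v) / n_side
                    acc += data[(n, m)] * cmath.exp(phase)
            spectrum[(u, v)] = acc
    return spectrum

def idft2(spectrum, n_side):
    """Inverse transform (3.17); returns samples keyed by (n, m)."""
    half = n_side // 2
    data = {}
    for n in range(-half, half):
        for m in range(-half, half):
            acc = 0j
            for u in range(n_side):
                for v in range(n_side):
                    phase = 2j * cmath.pi * (n * u + m * v) / n_side
                    acc += spectrum[(u, v)] * cmath.exp(phase)
            data[(n, m)] = acc / n_side ** 2
    return data

# Hypothetical 4 x 4 test array with arbitrary values.
N = 4
samples = {(n, m): float(3 * n + m) for n in range(-2, 2) for m in range(-2, 2)}
spectrum = dft2(samples, N)
recovered = idft2(spectrum, N)
round_trip_error = max(abs(recovered[key] - samples[key]) for key in samples)
print(f"largest round-trip error: {round_trip_error:.2e}")
```

The round trip recovers the original samples because the complex exponentials are periodic in $N$, so the centred spatial index range and the zero-based frequency index range are compatible.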


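Sampling the $(\tilde{x}, \tilde{y})$ coordinate frame in (3.7) produces the discrete Fourier series form for
$\tilde{h}$:

$$\tilde{h}[\xi, \eta] = \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} W_P(\lambda s_i n\Delta\tilde{x}, \lambda s_i m\Delta\tilde{y}; R_P) \exp\{ -j2\pi(\xi n\Delta\tilde{x} + \eta m\Delta\tilde{y}) \}\, \Delta\tilde{x}\, \Delta\tilde{y}. \tag{3.18}$$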

The discrete Fourier series representation, evaluated over an appropriate set of $[\xi, \eta]$
locations, can be posed as a discrete Fourier transform. There is only the minor difference
of axis scaling between the scaled Fourier transform, $\mathcal{F}_s\{\cdot\}$, and the Fourier transform.
This difference will be accounted for in a careful relationship between aperture sampling
and image sampling conventions. The relationship between sample spacing in the aperture
plane and sampling in the image plane is governed by the Nyquist sampling theorem. The
choice of $\Delta x$ and $\Delta y$ will be based on the dynamics of the field in the aperture. Applying
Nyquist to $\Delta x$, $\Delta y$ and the size of the aperture, the maximum sample spacing in the image
plane is given by:


$$\Delta\xi = \frac{\lambda s_i}{2N\Delta x}, \qquad \Delta\eta = \frac{\lambda s_i}{2N\Delta y}, \tag{3.19}$$

$$N\Delta x = D_P = \text{aperture diameter}. \tag{3.20}$$

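Recall the relationships $\tilde{x} = \frac{x}{\lambda s_i}$ and $\tilde{y} = \frac{y}{\lambda s_i}$ from the Background chapter, so that
$\Delta\tilde{x} = \frac{\Delta x}{\lambda s_i}$ and $\Delta\tilde{y} = \frac{\Delta y}{\lambda s_i}$. Making the
appropriate substitutions for sampling dimensions in the aperture and image planes yields:

$$\tilde{h}[u\Delta\xi, v\Delta\eta] = \sum_{n=-N/2}^{N/2-1} \sum_{m=-N/2}^{N/2-1} W_P\!\left( \lambda s_i n\frac{\Delta x}{\lambda s_i}, \lambda s_i m\frac{\Delta y}{\lambda s_i}; R_P \right) \times \exp\left\{ -j2\pi\left( u\Delta\xi\, n\frac{\Delta x}{\lambda s_i} + v\Delta\eta\, m\frac{\Delta y}{\lambda s_i} \right) \right\} \frac{\Delta x}{\lambda s_i} \frac{\Delta y}{\lambda s_i}, \tag{3.21}$$

$$\tilde{h}[u\Delta\xi, v\Delta\eta] = \sum_{n=-N/2}^{N/2-1} \sum_{m=-N/2}^{N/2-1} W_P\!\left( \lambda s_i n\frac{\Delta x}{\lambda s_i}, \lambda s_i m\frac{\Delta y}{\lambda s_i}; R_P \right) \times \exp\left\{ -j2\pi\left( u\frac{\lambda s_i}{2N\Delta x} n\frac{\Delta x}{\lambda s_i} + v\frac{\lambda s_i}{2N\Delta y} m\frac{\Delta y}{\lambda s_i} \right) \right\} \frac{\Delta x}{\lambda s_i} \frac{\Delta y}{\lambda s_i}, \tag{3.22}$$

$$\tilde{h}[u\Delta\xi, v\Delta\eta] = \frac{\Delta x \Delta y}{(\lambda s_i)^{2}} \sum_{n=-N/2}^{N/2-1} \sum_{m=-N/2}^{N/2-1} W_P[n\Delta x, m\Delta y; R_P] \exp\left\{ -j\frac{2\pi}{2N}(un + vm) \right\}, \tag{3.23}$$

$$\tilde{h}[u\Delta\xi, v\Delta\eta] = \frac{\Delta x \Delta y}{(\lambda s_i)^{2}}\, \mathcal{DFT}\{ W_P[n\Delta x, m\Delta y; R_P] \}. \tag{3.24}$$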

Thus, the discrete transformation of the aperture field into the image plane intensity is
given by:

$$I[u\Delta\xi, v\Delta\eta; \mathbf{a}] = \left( \frac{\Delta x \Delta y}{(\lambda s_i)^{2}} \right)^{2} \left| \mathcal{DFT}\{ \mathcal{P}[n\Delta x, m\Delta y; \mathbf{a}] \} \right|^{2}, \tag{3.25}$$

where the dependence of the pupil function on $R_P$ remains, but the variable has been
dropped to compact the notation. The bold image plane variable $\mathbf{I}$ will be substituted
for the continuous function $I$ to indicate that, while $I$ is defined over the space of all real
numbers, the discrete variable $\mathbf{I}$ is defined on a finite set of pixel locations $S$:

$$\mathbf{I}[u, v; \mathbf{a}] = I[u\Delta\xi, v\Delta\eta; \mathbf{a}]. \tag{3.26}$$

$S$ is the set of all pixel locations available in a CCD image. The bold $\mathbf{I}$ may also appear
with or without index variables. When $\mathbf{I}$ is presented with index variables, the expression
is meant to indicate a single location in the image set. When $\mathbf{I}$ is presented without index
variables, the expression refers to the entire image set:

$$\mathbf{I}[\mathbf{a}] = \{ \mathbf{I}[u, v; \mathbf{a}] : [u, v] \in S \}. \tag{3.27}$$

Finally, the discrete version of the OTF is given by the normalized discrete Fourier transform
of $\mathbf{I}$ where all atmospheric parameters are zero:

$$\mathcal{H} = \frac{\mathcal{DFT}\{\mathbf{I}\}}{\sum_{\mathbf{u}} \mathbf{I}[\mathbf{u}]}.$$

To reference a specific location in the OTF frequency domain, I will reuse the aperture
plane index variable $\mathbf{n}$.

The  discrete  reference  frames  and  linear  systems  operations  presented  here  can  be 
used to simulate wave optics phenomena. This simulation is intended to provide some
demonstration  of  wavefront  sensor  performance.  The  fidelity  of  the  simulation  will  suffer  if 
the  noise  characteristics  in  the  CCD  are  not  included.  Physically,  noise  occurs  during  the 
photon  conversion  process.  Thus,  to  model  CCD  noise,  some  stochastic  process  must  be 
included  in  the  model  to  distinguish  the  detected  image  from  the  ideal  image  I. 

3.2  The  Detected  Image 

The  interface  between  the  environment  and  the  wavefront  sensor  occurs  in  the  charge 
coupled  device  (CCD).  If  the  CCD  is  sensitive  enough  and  the  exposure  time  is  short, 
then  the  detector  may  be  modeled  as  a  photon  counting  device.  In  this  case,  intensity 
detected  in  the  image  plane  is  a  count  of  discrete  events  as  each  incoming  photon  interacts 
with  the  CCD.  Therefore,  the  detected  image  is  spatially  discrete,  as  it  is  formed  from  an 
array  of  image  sample  points,  and  discrete  in  the  level  of  intensity  measured  at  each  sample 
point. The largest noise contribution is due to random photon arrival, often referred to
as shot noise. Under the influence of shot noise only, the CCD image can be considered a
grid of photon bins where the detected image is a count of Poisson distributed events. Let
$d[u, v]$ be a random variable representing the photon count with added shot noise in a single
CCD pixel. Let $D[u, v]$ be a realization of the random variable $d[u, v]$. Note the use of
the common convention that lower case variables indicate random variables and upper case
variables indicate realizations of those random variables. The value of any single detected
pixel is given by:

$$d[u, v] = \text{Poisson}\{ \mathbf{I}[u, v; \mathbf{a}] \}. \tag{3.28}$$
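The shot noise model can be sketched in a few lines: each pixel of a noiseless expected image becomes the rate of an independent Poisson draw. The sampler below uses Knuth's multiplication method, which is an implementation choice made here for a self-contained example and not something specified by the model; the function names and the 3 x 3 expected image are likewise hypothetical.

```python
import math
import random

def poisson_sample(rate, rng):
    """Draw one Poisson count using Knuth's multiplication method.

    Adequate for the modest rates of a short exposure; very large rates
    would underflow exp(-rate) and call for a different sampler.
    """
    if rate <= 0.0:
        return 0
    limit = math.exp(-rate)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def detect_image(expected_image, seed=0):
    """Return one shot-noise realization D of the expected image I."""
    rng = random.Random(seed)
    return [[poisson_sample(rate, rng) for rate in row] for row in expected_image]

# Hypothetical 3 x 3 expected image (mean photon counts per pixel).
expected = [[0.0, 2.0, 0.0],
            [2.0, 20.0, 2.0],
            [0.0, 2.0, 0.0]]
detected = detect_image(expected)
for row in detected:
    print(row)
```

Pixels with zero expected intensity always read zero, while brighter pixels fluctuate about their rate with a variance equal to that rate, which is why dim regions of the image carry relatively little reliable information.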


Thus, each pixel in the detected image, $d[u, v]$, is a Poisson random variable with its
parameter being the associated value from the image predicted by the discrete linear system
model with input parameter set $\mathbf{a}$. Using this probability model for each pixel in $\mathbf{d}$, the
conditional density for a detected pixel given an aperture field constructed from the set of
Zernike coefficients $\mathbf{A}$ is:

$$p_{d[u,v]|\mathbf{a}}(D[u,v] \,|\, \mathbf{A}) = \frac{\mathbf{I}[u, v; \mathbf{A}]^{D[u,v]} \exp\{ -\mathbf{I}[u, v; \mathbf{A}] \}}{D[u, v]!}, \tag{3.29}$$

$$\text{where } D[u, v] = \text{photo detection events in image pixel } [u, v], \tag{3.30}$$

$$\text{and } \mathbf{I}[u, v; \mathbf{A}] = \text{Poisson rate function, which is the noiseless image}. \tag{3.31}$$


There may be multiple image planes associated with a subaperture. Suppose that the
optical path is split such that there are $N_D$ imaging planes. Each image and its respective
set of pixel locations will be indexed with the subscript $i$:

$$\mathbf{D}_i = \{ D_i[\mathbf{u}] : \mathbf{u} \in S_i \}, \tag{3.32}$$

$$\text{where } S_i = i\text{th sample space of pixels}, \tag{3.33}$$

$$\text{and } \mathbf{u} = [u, v]. \tag{3.34}$$

The combined sets of detected images and expected images are denoted $\mathbf{D}_{\cup}$ and $\mathbf{I}_{\cup}$
respectively:

$$\mathbf{D}_{\cup} = \bigcup_{i=1}^{N_D} \mathbf{D}_i. \tag{3.35}$$
The estimator will require a joint density for all pixels in all image planes. It is possible
to extend the pdf in (3.29) above to a joint pdf including pixels from all image arrays by
assuming that the pixels are statistically independent Poisson random variables. For
independent random variables, the joint density is the product of the marginal densities.
Taking the product over every pixel in all images, the conditional density becomes:

$$p_{\mathbf{d}_{\cup}|\mathbf{a}}(\mathbf{D}_{\cup} \,|\, \mathbf{A}) = \prod_{i=1}^{N_D} \prod_{\mathbf{u} \in S_i} \frac{\mathbf{I}_i[\mathbf{u}; \mathbf{A}]^{D_i[\mathbf{u}]} \exp( -\mathbf{I}_i[\mathbf{u}; \mathbf{A}] )}{D_i[\mathbf{u}]!}. \tag{3.36}$$
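In practice the product in (3.36) is evaluated as a sum of logarithms to avoid numerical underflow. The sketch below computes that log density for a set of image planes, each flattened to a list of pixels. The function name and the example values are assumptions, and the expected intensities are assumed to be strictly positive so that the logarithm is defined.

```python
import math

def joint_log_density(detected_planes, expected_planes):
    """Log of the joint Poisson density over all planes and pixels.

    Each pixel contributes D * ln(I) - I - ln(D!), where ln(D!) is
    evaluated as lgamma(D + 1). Expected intensities must be positive.
    """
    total = 0.0
    for detected, expected in zip(detected_planes, expected_planes):
        for count, rate in zip(detected, expected):
            total += count * math.log(rate) - rate - math.lgamma(count + 1)
    return total

# Two hypothetical image planes, each flattened to a short list of pixels.
expected_planes = [[3.0, 1.0], [2.0, 4.0]]
detected_planes = [[2, 1], [0, 5]]
log_density = joint_log_density(detected_planes, expected_planes)
print(f"joint log density: {log_density:.4f}")
```

Because the pixels are treated as independent, the log density of the combined set is simply the sum of the per-plane log densities, so adding an image plane adds terms without changing the existing ones.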


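The maximum likelihood estimator $\hat{\mathbf{a}}_{ml}$ is formed by assuming the pdf for the parameter
set is uniform and maximizing the log-likelihood expression over the range of the parameter
set $\mathbf{A}$ as in (2.19):

$$\hat{\mathbf{a}}_{ml} = \arg\max_{\mathbf{A}} \left\{ \ln\{ p_{\mathbf{d}|\mathbf{a}}(\mathbf{D} \,|\, \mathbf{A}) \} \right\} \tag{3.37}$$

$$= \arg\max_{\mathbf{A}} \left\{ \sum_{i=1}^{N_D} \sum_{\mathbf{u} \in S_i} D_i[\mathbf{u}] \ln\{ \mathbf{I}_i[\mathbf{u}; \mathbf{A}] \} - \mathbf{I}_i[\mathbf{u}; \mathbf{A}] - \ln\{ D_i[\mathbf{u}]! \} \right\} \tag{3.38}$$

$$= \arg\max_{\mathbf{A}} \left\{ \sum_{i=1}^{N_D} \sum_{\mathbf{u} \in S_i} D_i[\mathbf{u}] \ln\{ \mathbf{I}_i[\mathbf{u}; \mathbf{A}] \} - \mathbf{I}_i[\mathbf{u}; \mathbf{A}] \right\}, \tag{3.39}$$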


where the detected image $\mathbf{D}$ represents the observation vector, $\mathbf{R}$, in (2.19). Also, the
parameter $A$ in (2.19) has been replaced with the entire set of Zernike coefficients, $\mathbf{a}$. If
the pdf for the parameter set is not uniform then the log-likelihood includes the additional
term, $\ln\{ p_{\mathbf{a}}(\mathbf{A}) \}$. Recall that the phase at each point within a single phase screen is the
sum of phase contributions along some optical path. Furthermore, each Zernike coefficient
is computed by a projection sum using the resulting phase screen points. Given that the
Zernike coefficients are formed from the sum of many zero mean random variables, it is
reasonable to assume that the Central Limit Theorem applies and thus each Zernike coefficient
is zero mean Gaussian with a variance that depends on the atmospheric turbulence model.
Therefore, let the joint pdf for the parameter set be a multivariate Gaussian:

$$p_{\mathbf{a}}(\mathbf{A}) = \frac{1}{(2\pi)^{N/2} \left( \det(\Lambda_{\mathbf{a}}) \right)^{1/2}} \exp\left\{ -\frac{1}{2} \mathbf{A} \Lambda_{\mathbf{a}}^{-1} \mathbf{A}^{T} \right\}, \tag{3.40}$$
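where the matrix, $\Lambda_{\mathbf{a}}$, is the covariance matrix for the jointly Gaussian parameter set $\mathbf{a}$.
An example of the matrix, $\Lambda_{\mathbf{a}}$, for parameters $a_2$ through $a_n$ was generated in Table 2.5,
Section 2.3. Substituting $\ln\{ p_{\mathbf{a}}(\mathbf{A}) \}$ and $\ln\{ p_{\mathbf{d}|\mathbf{a}}(\mathbf{D} \,|\, \mathbf{A}) \}$ into (2.17), the maximum a
posteriori estimator $\hat{\mathbf{a}}_{map}$ is given by:

$$\hat{\mathbf{a}}_{map} = \arg\max_{\mathbf{A}} \left\{ \ln\{ p_{\mathbf{d}|\mathbf{a}}(\mathbf{D} \,|\, \mathbf{A}) \} + \ln\{ p_{\mathbf{a}}(\mathbf{A}) \} \right\} \tag{3.41}$$

$$= \arg\max_{\mathbf{A}} \left\{ \sum_{i=1}^{N_D} \sum_{\mathbf{u} \in S_i} D_i[\mathbf{u}] \ln\{ \mathbf{I}_i[\mathbf{u}; \mathbf{A}] \} - \mathbf{I}_i[\mathbf{u}; \mathbf{A}] + \ln\left\{ \frac{1}{(2\pi)^{N/2} \left( \det(\Lambda_{\mathbf{a}}) \right)^{1/2}} \right\} - \frac{1}{2} \mathbf{A} \Lambda_{\mathbf{a}}^{-1} \mathbf{A}^{T} \right\} \tag{3.42}$$

$$= \arg\max_{\mathbf{A}} \left\{ \sum_{i=1}^{N_D} \sum_{\mathbf{u} \in S_i} D_i[\mathbf{u}] \ln\{ \mathbf{I}_i[\mathbf{u}; \mathbf{A}] \} - \mathbf{I}_i[\mathbf{u}; \mathbf{A}] - \frac{1}{2} \mathbf{A} \Lambda_{\mathbf{a}}^{-1} \mathbf{A}^{T} \right\}. \tag{3.43}$$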
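The difference between the ML and MAP estimators can be seen in a toy problem with a single coefficient. In the sketch below, the two-pixel image model `toy_expected_image` is a hypothetical stand-in for the expected image and is not the wave optics model of this chapter; the grid search, the function names and the numerical values are also assumptions made for illustration. The ML estimate is obtained by making the prior variance so large that the prior term is negligible.

```python
import math

def poisson_log_likelihood(detected, expected):
    """Data term: sum over pixels of D * ln(I) - I."""
    return sum(d * math.log(i) - i for d, i in zip(detected, expected))

def log_prior(coeff, variance):
    """Prior term for one zero-mean Gaussian coefficient: -a^2 / (2 * variance)."""
    return -0.5 * coeff * coeff / variance

def toy_expected_image(coeff):
    """Hypothetical two-pixel stand-in for the expected image I[u; A]."""
    t = math.tanh(coeff)
    return [50.0 + 40.0 * t, 50.0 - 40.0 * t]

def estimate(detected, variance, candidates):
    """Grid search for the coefficient maximizing likelihood plus prior."""
    def objective(coeff):
        return (poisson_log_likelihood(detected, toy_expected_image(coeff))
                + log_prior(coeff, variance))
    return max(candidates, key=objective)

candidates = [k / 100.0 for k in range(-200, 201)]
detected = [80, 20]
ml_estimate = estimate(detected, 1e12, candidates)    # nearly flat prior
map_estimate = estimate(detected, 0.01, candidates)   # tight prior about zero
print(f"ML estimate: {ml_estimate:.2f}, MAP estimate: {map_estimate:.2f}")
```

With a tight prior the MAP estimate is pulled toward zero relative to the ML estimate, reflecting the trade-off between the detected data and the prior knowledge of the coefficient statistics. A grid search is used only because the toy problem has one parameter; it does not scale to the full parameter set.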


Using the linear system model and the CCD noise model, ML and MAP estimators
have been derived. These estimators make use of the final ingredient required for a
parameter estimating wavefront sensor, the conditional probability mapping, $p_{\mathbf{r}|\mathbf{a}}(\mathbf{R} \,|\, \mathbf{A})$, introduced
in Section 2.1. While these estimators will yield good performance, they may not be
capable of being implemented in a real time wavefront sensor due to CCD read out time and
mathematical complexity. The following sections discuss modifications to the estimators
that offer trade-offs between complexity and performance.


3.3  The  Image  Projection 

The wavefront sensor design must map from the detected image intensity to the aperture phase. This implies that the first step in the wavefront correction process will be to detect and store the intensity in the image plane. Though the two-dimensional image contains a wealth of data, the processing time required to read out and evaluate that number of data points can exceed the maximum bandwidth for a real time adaptive optics system. The time spent reading out the charge from each pixel location in the CCD can become the largest portion of the time required for processing the image signal. Additionally, read out noise is proportional to the number of pixels read from the CCD. The read out noise model will be introduced in the next section. Compressing the number of data points without losing vital information for wavefront sensing can become a time saving step, if not a necessity, when trying to achieve real time adaptive optics system bandwidths. One technique for reducing the number of data points is to project the image plane into a vector format prior to reading the CCD. The projection operation can be performed quickly on the chip using simple shifting and summing of data. The simplest image projections are the one-dimensional projections along either the u or v axis:


$$D^{u}[u] = \sum_{m=0}^{N-1} D[u, m], \qquad (3.44)$$

$$D^{v}[v] = \sum_{n=0}^{N-1} D[n, v], \qquad (3.45)$$

which  are  the  sums  of  the  two-dimensional  image  in  each  direction. 


Image Projections and the OTF. The true benefit of creating the pair of "vectorized" images can be realized by first recalling the discrete Fourier transform (3.16). The two-dimensional DFT is a separable summation. It is possible to perform the DFT for the zeroth order frequency in each direction independently. These operations reveal a special benefit of the vectorized image:


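$$\mathcal{D}[n, 0] = \sum_{u=0}^{N-1} \left[ \sum_{v=0}^{N-1} D[u, v] \right] \exp\left\{ j\frac{2\pi}{N}\left(nu + 0 \cdot v\right) \right\} \qquad (3.46)$$

$$= \sum_{u=0}^{N-1} D^{u}[u] \exp\left\{ j\frac{2\pi}{N} nu \right\}, \qquad (3.47)$$

$$\mathcal{D}[0, m] = \sum_{v=0}^{N-1} \left[ \sum_{u=0}^{N-1} D[u, v] \right] \exp\left\{ j\frac{2\pi}{N}\left(0 \cdot u + mv\right) \right\} \qquad (3.48)$$

$$= \sum_{v=0}^{N-1} D^{v}[v] \exp\left\{ j\frac{2\pi}{N} mv \right\}. \qquad (3.49)$$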


This shows that performing a one-dimensional Fourier transform on the vectorized image produces the zeroth order frequency vector from the two-dimensional transform. To realize the importance of this, recall the OTF examples in Figure 2.8. The OTF, in this case, was a simple Fourier transform of the image. Vectorizing the image on the CCD, then performing a one-dimensional Fourier transform on the result, produces a slice of the OTF much like the cross-section plots provided in Figure 3.3. Given the uniqueness in the OTF cross-section under the influence of some limited set of Zernike modes, it may be possible to distinguish, and therefore estimate, the effects from each of the Zernikes. Thus, the vectorized image promises a much faster read out time from the CCD along with some amount of information useful for estimating pupil phase.

Figure 3.3 Simulated OTFs for a diffraction limited optical system (column 1) and systems under independent influence from Zernikes 2-6 (columns 2-6 respectively). Rows 1 and 3: real and imaginary parts of the OTFs respectively. Rows 2 and 4: projections corresponding to dashed section lines in rows 1 and 3. Column titles: Diffraction Limited, Tilt-x, Tilt-y, Defocus, Astigmatism-xy, Astigmatism; horizontal axes run from -0.4 to 0.4.
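The relationship between the image projections and the zero-frequency slices of the two-dimensional DFT can be checked numerically. The short Python sketch below uses a random test image as a stand-in for the detected image (an assumption made only for illustration). NumPy's forward transform uses a negative exponent; the slice property holds for either sign convention.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
image = rng.poisson(lam=20.0, size=(N, N)).astype(float)  # stand-in for D[u, v]

# One-dimensional projections of the two-dimensional image
proj_u = image.sum(axis=1)  # sum over v, leaving a function of u
proj_v = image.sum(axis=0)  # sum over u, leaving a function of v

full_dft = np.fft.fft2(image)

# The 1-D DFT of each projection equals a zero-frequency slice of the 2-D DFT
slice_n0 = np.fft.fft(proj_u)  # corresponds to the [n, 0] slice
slice_0m = np.fft.fft(proj_v)  # corresponds to the [0, m] slice

print(np.allclose(slice_n0, full_dft[:, 0]))
print(np.allclose(slice_0m, full_dft[0, :]))
```

Only 2N values are transformed instead of N x N, which is the computational counterpart of the read out savings described above.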


A General Image Projection Operator. A more general description of the projection operation is required before including it in the estimator models. The notation for key quantities and reference frames within the discretized optical system model will be used to define the general form of the image projection operation. The image projection operator accepts an array or set of arrays for input and returns a vector. Additionally, at least two images are formed in separate CCD arrays. Given multiple arrays, each CCD may be rotated such that its $\hat{v}_i$ unit vector forms a unique angle, $\theta_i$, with the original image plane $\hat{v}$ direction. The CCD image arrays are given by:


$$D_{i,\theta_i}[u_i, v_i] = \text{pixel from CCD image array } i, \qquad (3.50)$$

$$\theta_i = \frac{180}{\pi} \cos^{-1}\left(\hat{v} \cdot \hat{v}_i\right) \ \text{[degrees]}. \qquad (3.51)$$

The image projection operator $v(\cdot)$ accepts the set of image arrays $\{D_{i,\theta_i} : i \in 1, 2, \ldots, N_D\}$ for input. First, each input array is windowed by discarding all pixels except the $N_W \times N_W$ pixel region centered around the optical axis. Since the windowed pixels are the only pixels of interest from the larger set $S_i$, continuing to use the old index values $[u_i, v_i]$ becomes unnecessary. It is more convenient to renumber the windowed array using a 1 to $N_W$ row, column numbering system. The resulting set of windowed pixels is summed along the $v_i$ direction according to a set of starting and ending row number pairs contained in the set s:

$$s = \{(1, s_2), \ldots, (s_{N_s-1}, s_{N_s}), N_W\}. \qquad (3.52)$$

The last entry in the set s does not provide a summation pair; rather, it identifies the window length in pixels, $N_W$. For notational simplicity, if the $N_W$ entry in s is omitted, it is assumed that the final ordered pair, $(s_{N_s-1}, s_{N_s})$, ends on the window length index, $N_W$ (i.e., $s_{N_s} = N_W$). The following are examples of s for a 6 x 6 window: s = {(1, 6)} identifies the projection operation which sums along the entire windowed region; s = {(1, 3), 6} indicates that the projection only includes the first 3 rows, but the window size is 6 x 6; and s = {(1, 3), (4, 6)} describes the projection operation which sums the upper half and lower half (rows 1 to 3 and 4 to 6) of the window into two separate vector projections. The set s will be included as a preceding subscript on $v(\cdot)$ whenever the projection operation requires clarification. The resulting vector output is a concatenation of $N_v$ projection sums, where $N_v$ is the number of ordered pairs in s. The variable $\ell \in \{1, 2, \ldots, N_v N_W\}$ indexes the location in the resulting vector. Given this convention, the general form of the projection operator output with individual vector location index is either denoted:


$${}_{s}v_{\ell}\left(D_{1,\theta_1}, D_{2,\theta_2}, \ldots, D_{N_D,\theta_{N_D}}\right), \qquad (3.53)$$

or by the more compact notation:

$${}_{s}v_{\ell}\left(D_{\cup}\right), \qquad (3.54)$$

$$\text{where } D_{\cup} = \bigcup_{i=1}^{N_D} D_{i,\theta_i}. \qquad (3.55)$$


As an example, consider the operation which employs whole plane projections from two CCDs at 0° and 90° rotations using a 6 x 6 window. The $\ell$th location in the projection operation output is denoted:

$${}_{\{(1,6)\}}v_{\ell}\left(D_{1,0}, D_{2,90}\right). \qquad (3.56)$$
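The windowing and row-pair summation described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `project`, the assumption that the optical axis sits at the array center, and the ordering of the concatenated output (by CCD, then by row pair) are choices made here, not details taken from the sensor design.

```python
import numpy as np

def project(images, pairs, n_w):
    """Sketch of the projection operator for a list of square CCD arrays.

    images : list of 2-D arrays, each already expressed in its own
             (rotated) CCD frame, with rows indexed along v_i
    pairs  : list of (start_row, end_row) tuples, 1-based and inclusive
    n_w    : window length N_W in pixels
    """
    pieces = []
    for img in images:
        n = img.shape[0]
        lo = (n - n_w) // 2  # assumes the optical axis is at the array center
        win = img[lo:lo + n_w, lo:lo + n_w]
        for start, end in pairs:
            # sum rows start..end, leaving one value per column
            pieces.append(win[start - 1:end, :].sum(axis=0))
    return np.concatenate(pieces)

ccd = np.arange(64, dtype=float).reshape(8, 8)   # toy 8 x 8 image
whole = project([ccd], [(1, 6)], 6)              # s = {(1, 6)}
halves = project([ccd], [(1, 3), (4, 6)], 6)     # s = {(1, 3), (4, 6)}
two_ccds = project([ccd, np.rot90(ccd)], [(1, 6)], 6)  # rot90 stands in for a CCD at 90 degrees
```

With s = {(1, 3), (4, 6)} the two half-window projections add up to the whole-window projection, which gives a quick consistency check on any implementation.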

Given  the  general  form  for  the  projection  operator  it  is  possible  to  develop  a  pair  of 
projection  based  estimator  expressions.  The  joint  density  for  an  image  projection  is  given 
by: 


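$$p_{{}_{s}v(\mathbf{d}_{\cup})|\mathbf{a}}\left({}_{s}v(D_{\cup})\,|\,A\right) = \prod_{\ell=1}^{N_v N_W} \frac{{}_{s}v_{\ell}\left(I_{\cup}[A]\right)^{{}_{s}v_{\ell}(D_{\cup})} \exp\left({}_{s}v_{\ell}\left(-I_{\cup}[A]\right)\right)}{{}_{s}v_{\ell}\left(\bigcup_{i=1}^{N_D} \left\{ \left(D_{i,\theta_i}[\mathbf{u}]\right)! : \mathbf{u} \in S_i \right\}\right)}. \qquad (3.57)$$

The notation $(D_{i,\theta_i}[\mathbf{u}])!$ indicates that the factorial is applied at each pixel location $\mathbf{u} \in S_i$.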


Substituting  (3.57)  into  (2.19),  the  form  for  the  maximum  likelihood  estimator  is  given  by: 


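$$\hat{\mathbf{a}}_{ml} = \arg\max_{A}\left\{ \sum_{\ell=1}^{N_v N_W} {}_{s}v_{\ell}(D_{\cup}) \ln\left\{ {}_{s}v_{\ell}(I_{\cup}[A]) \right\} - {}_{s}v_{\ell}(I_{\cup}[A]) - \ln\left\{ {}_{s}v_{\ell}\left(\bigcup_{i=1}^{N_D} \left\{ D_{i,\theta_i}[\mathbf{u}]! : \mathbf{u} \in S_i \right\}\right) \right\} \right\} \qquad (3.58)$$

$$= \arg\max_{A}\left\{ \sum_{\ell=1}^{N_v N_W} {}_{s}v_{\ell}(D_{\cup}) \ln\left\{ {}_{s}v_{\ell}(I_{\cup}[A]) \right\} - {}_{s}v_{\ell}(I_{\cup}[A]) \right\}. \qquad (3.59)$$

Similarly, the form for the maximum a posteriori estimator is given:

$$\hat{\mathbf{a}}_{map} = \arg\max_{A}\left\{ \sum_{\ell=1}^{N_v N_W} \left[ {}_{s}v_{\ell}(D_{\cup}) \ln\left\{ {}_{s}v_{\ell}(I_{\cup}[A]) \right\} - {}_{s}v_{\ell}(I_{\cup}[A]) \right] - \tfrac{1}{2} A \Lambda_{\mathbf{a}}^{-1} A^{T} \right\}. \qquad (3.60)$$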
3.4  Other  Detected  Image  Models

While  the  Poisson  pdf  model  for  an  image  pixel  is  sufficient,  it  does  not  account  for 
all  forms  of  CCD  noise.  In  addition  to  shot  or  Poisson  noise,  read  noise  is  always  present. 
While  shot  noise  is  due  to  the  random  arrival  of  photons  and  is  the  basis  of  the  Poisson 
model,  read  noise  is  generated  by  the  buffers  and  amplifiers  used  to  digitize  the  photon 
count  voltage.  Poisson  noise  is  signal  dependent  because  the  Poisson  rate  parameter  is  the 
noiseless  image  value.  Read  noise  is  independent  of  the  signal.  Instead,  read  noise  depends 
on  the  number  of  pixels  read  from  the  CCD. 


Accounting for Read Noise in the Image pdf. Let $n_{ro}[\mathbf{u}]$ represent the read noise in a CCD pixel. Let the read noise, $n_{ro}[\mathbf{u}]$, be a zero mean Gaussian process with variance $\sigma_{ro}^2$. Considering the effects of read noise and shot noise to be independent, the pixel with combined shot noise and read noise effects, $d_{ro}[\mathbf{u}]$, is modeled as the sum of independent Poisson and Gaussian random variables. The pdf of the sum z = x + y is a convolution of the pdfs of x and y when x and y are independent [22]:


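$$p_{d[\mathbf{u}]|\mathbf{a}}\left(D[\mathbf{u}]\,|\,A\right) = \exp\{-I[\mathbf{u}; A]\}\, \frac{I[\mathbf{u}; A]^{D[\mathbf{u}]}}{D[\mathbf{u}]!}, \qquad (3.61)$$

$$p_{n_{ro}[\mathbf{u}]}\left(N_{ro}[\mathbf{u}]\right) = \frac{1}{\sqrt{2\pi}\,\sigma_{ro}} \exp\left\{ -\frac{N_{ro}[\mathbf{u}]^2}{2\sigma_{ro}^2} \right\}, \qquad (3.62)$$

$$d_{ro}[\mathbf{u}] = d[\mathbf{u}] + n_{ro}[\mathbf{u}], \qquad (3.63)$$

$$p_{d_{ro}[\mathbf{u}]|\mathbf{a}}\left(D_{ro}[\mathbf{u}]\,|\,A\right) = \frac{1}{\sqrt{2\pi}\,\sigma_{ro}} \sum_{D[\mathbf{u}]=0}^{\infty} \exp\left\{ -\frac{\left(D_{ro}[\mathbf{u}] - D[\mathbf{u}]\right)^2}{2\sigma_{ro}^2} \right\} \exp\{-I[\mathbf{u}; A]\}\, \frac{I[\mathbf{u}; A]^{D[\mathbf{u}]}}{D[\mathbf{u}]!}. \qquad (3.64)$$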


This expression must be modified slightly to account for the discrete nature of the CCD electronics. The A/D conversion process will apply a rounding function to the continuous set of values allowed by the Gaussian pdf. This will have the effect of forcing $d_{ro}[\mathbf{u}]$ to take on only nonnegative integer values. The A/D conversion can be included in the statistical model as an integration around each integer value for the Poisson random variable:


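$$p_{d_{ro}[\mathbf{u}]|\mathbf{a}}\left(D_{ro}[\mathbf{u}]\,|\,A\right) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\,\sigma_{ro}} \displaystyle\sum_{D[\mathbf{u}]=0}^{\infty} \exp\{-I[\mathbf{u}; A]\}\, \dfrac{I[\mathbf{u}; A]^{D[\mathbf{u}]}}{D[\mathbf{u}]!} \times \displaystyle\int_{-\infty}^{0.5} \exp\left\{ -\dfrac{\left(\xi - D[\mathbf{u}]\right)^2}{2\sigma_{ro}^2} \right\} d\xi, & \text{for } D_{ro}[\mathbf{u}] = 0, \\[2ex] \dfrac{1}{\sqrt{2\pi}\,\sigma_{ro}} \displaystyle\sum_{D[\mathbf{u}]=0}^{\infty} \exp\{-I[\mathbf{u}; A]\}\, \dfrac{I[\mathbf{u}; A]^{D[\mathbf{u}]}}{D[\mathbf{u}]!} \times \displaystyle\int_{D_{ro}[\mathbf{u}]-0.5}^{D_{ro}[\mathbf{u}]+0.5} \exp\left\{ -\dfrac{\left(\xi - D[\mathbf{u}]\right)^2}{2\sigma_{ro}^2} \right\} d\xi, & \text{for } D_{ro}[\mathbf{u}] \in \{1, 2, 3, \ldots\}. \end{cases} \qquad (3.65)$$

Unfortunately, this distribution does not simplify and, although it can be evaluated numerically, it is far too computationally expensive for use in the sensor model. However, the numerically evaluated distribution of $d_{ro}[\mathbf{u}]$ begins to look like a biased Poisson statistic. This characteristic allows for a far simpler mathematical expression that can be used in the real time sensor model.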


Approximating  the  Read  Noise  pdf.  The  combination  of  Poisson  shot  noise  and 
Gaussian  read  noise  can  be  approximated  by  a  biased  Poisson  statistic 


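$$d_{ro}[\mathbf{u}] = \text{Poisson}\left\{ I[\mathbf{u}; \mathbf{a}] + \sigma_{ro}^2 \right\} - \sigma_{ro}^2, \qquad (3.66)$$

$$d_{ro}[\mathbf{u}] + \sigma_{ro}^2 = \text{Poisson}\left\{ I[\mathbf{u}; \mathbf{a}] + \sigma_{ro}^2 \right\}. \qquad (3.67)$$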
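One way to see why the biased Poisson statistic is a reasonable stand-in is that it matches the first two moments of the exact model: a Poisson variable with mean I plus independent zero mean Gaussian noise has mean I and variance I + σ²_ro, and so does a Poisson variable with mean I + σ²_ro shifted down by σ²_ro. The Monte Carlo sketch below illustrates this in Python; the signal level and read noise values are arbitrary illustrative choices, not values from the sensor simulations.

```python
import numpy as np

rng = np.random.default_rng(1)
mean_signal = 50.0   # noiseless pixel value I[u; a] (illustrative)
sigma_ro = 3.0       # read noise standard deviation (illustrative)
n = 200_000

# Exact model: Poisson shot noise plus zero mean Gaussian read noise
exact = rng.poisson(mean_signal, n) + rng.normal(0.0, sigma_ro, n)

# Approximation: biased Poisson statistic
approx = rng.poisson(mean_signal + sigma_ro**2, n) - sigma_ro**2

print(exact.mean(), approx.mean())  # both close to mean_signal
print(exact.var(), approx.var())    # both close to mean_signal + sigma_ro**2
```

The two distributions are not identical (the exact one is continuous before A/D rounding, for instance), so this check supports the approximation only at the level of mean and variance.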


Using this approximate distribution, it is possible to develop a new expression for the MAP estimator. Begin with the property: the sum of independent Poisson random variables is distributed Poisson with a mean equal to the sum of the individual means. Using (3.67), the joint density for all pixels in the image projection conditioned on the set of Zernike coefficients becomes:


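$$p_{{}_{s}v_{\ell}(\mathbf{d}_{\cup})+\sigma_{ro}^2|\mathbf{a}}\left({}_{s}v_{\ell}(D_{\cup}) + \sigma_{ro}^2\,|\,A\right) = \exp\left\{ -\left[ {}_{s}v_{\ell}(I_{\cup}[A]) + \sigma_{ro}^2 \right] \right\} \times \frac{\left[ {}_{s}v_{\ell}(I_{\cup}[A]) + \sigma_{ro}^2 \right]^{{}_{s}v_{\ell}(D_{\cup}) + \sigma_{ro}^2}}{{}_{s}v_{\ell}\left(\bigcup_{i=1}^{N_D} \left\{ D_{i,\theta_i}[\mathbf{u}]! : \mathbf{u} \in S_i \right\}\right) + \sigma_{ro}^2}. \qquad (3.68)$$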


The  joint  density  for  locations  in  the  image  projection  vector  is  simply  the  product  of  the 
marginals: 


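$$p_{{}_{s}v(\mathbf{d}_{\cup})+\boldsymbol{\sigma}_{ro}^2|\mathbf{a}}\left({}_{s}v(D_{\cup}) + \boldsymbol{\sigma}_{ro}^2\,|\,A\right) = \prod_{\ell=1}^{N_v N_W} \exp\left\{ -\left[ {}_{s}v_{\ell}(I_{\cup}[A]) + \sigma_{ro}^2 \right] \right\} \frac{\left[ {}_{s}v_{\ell}(I_{\cup}[A]) + \sigma_{ro}^2 \right]^{{}_{s}v_{\ell}(D_{\cup}) + \sigma_{ro}^2}}{{}_{s}v_{\ell}\left(\bigcup_{i=1}^{N_D} \left\{ D_{i,\theta_i}[\mathbf{u}]! : \mathbf{u} \in S_i \right\}\right) + \sigma_{ro}^2}. \qquad (3.69)$$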


The bold read out variance $\boldsymbol{\sigma}_{ro}^2$ indicates the product $\mathbf{1}\sigma_{ro}^2$, where $\mathbf{1}$ represents a vector of ones of length $N_v N_W$. Taking the natural log of the conditional density gives:


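$$\ln\left\{ p_{{}_{s}v(\mathbf{d}_{\cup})+\boldsymbol{\sigma}_{ro}^2|\mathbf{a}}\left({}_{s}v(D_{\cup}) + \boldsymbol{\sigma}_{ro}^2\,|\,A\right) \right\} = \sum_{\ell=1}^{N_v N_W} \left[ {}_{s}v_{\ell}(D_{\cup}) + \sigma_{ro}^2 \right] \ln\left\{ {}_{s}v_{\ell}(I_{\cup}[A]) + \sigma_{ro}^2 \right\} - {}_{s}v_{\ell}(I_{\cup}[A]) - \sigma_{ro}^2 - \ln\left\{ {}_{s}v_{\ell}\left(\bigcup_{i=1}^{N_D} \left\{ D_{i,\theta_i}[\mathbf{u}]! : \mathbf{u} \in S_i \right\}\right) + \sigma_{ro}^2 \right\}. \qquad (3.70)$$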


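Recall the log of the jointly Gaussian prior density:

$$\ln\{p_{\mathbf{a}}(A)\} = \ln\left\{ \frac{1}{(2\pi)^{N_a/2} (\det \Lambda_{\mathbf{a}})^{1/2}} \right\} - \tfrac{1}{2} A \Lambda_{\mathbf{a}}^{-1} A^{T}. \qquad (3.71)$$

Combining these results yields the MAP estimator:

$$\hat{\mathbf{a}}_{map} = \arg\max_{A}\left\{ \sum_{\ell=1}^{N_v N_W} \left[ \left( {}_{s}v_{\ell}(D_{\cup}) + \sigma_{ro}^2 \right) \ln\left\{ {}_{s}v_{\ell}(I_{\cup}[A]) + \sigma_{ro}^2 \right\} - {}_{s}v_{\ell}(I_{\cup}[A]) \right] - \tfrac{1}{2} A \Lambda_{\mathbf{a}}^{-1} A^{T} \right\}, \qquad (3.72)$$

where, once again, all terms with no dependence on A are discarded from the estimator expression.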


3.5  Summary 

The chapter began by developing discrete reference frames and a discrete linear systems model from their continuous counterparts introduced in the background chapter. Then a model for CCD noise was provided. This led to the idea that the detected image is not simply the output of the linear system, but rather the system output combined with some random noise. From there, the joint density for a group of pixels in several image planes conditioned on the set of atmospheric phase coefficients was formed. The conditional density was used to create the first attempts at ML and MAP estimators.

The  CCD  image  contains  too  many  pixels  for  read  out  in  real  time.  To  limit  the 
number  of  pixels  read  out  of  the  CCD,  the  image  projection  was  used.  The  image  projection 
has  the  benefits  of  shorter  read  out  time  and  decreased  read  noise.  The  ML  and  MAP 
estimators  were  adapted  to  include  the  generic  image  projection  operator.  An  approximated 
read  noise  pdf  was  provided.  Combining  the  read  noise  pdf  with  the  image  projection 
concept  gave  the  final  form  for  the  ML  and  MAP  estimators.  The  chapters  to  follow  will 
detail  an  algorithm  for  a  curvature  sensor  implementation  using  these  estimators  along  with 
sensor  performance  bounding  and  simulated  performance  of  the  curvature  sensor. 
4.  A  Survey  of  Wavefront  Sensing  Techniques

This chapter provides a brief overview of several wavefront sensing methods. The discussion begins with interferometric techniques, followed by a review of a method known as phase retrieval or phase diversity. I introduce the role of parameter estimation in wavefront sensing with a description of the Hartmann sensor. Finally, I have included a more detailed discussion of two parameter estimation methods which use image projections. The wavefront sensor designs presented in this dissertation are modified versions of Cain's projection
based  estimator  pQ. 


4.1  Wavefront  Sensing  through  Interferometry

Interferometric techniques for measuring wavefront phase involve interpreting the intensity patterns arising from the interaction of two or more coherent fields. Two fields arriving at a point in space and time are perfectly coherent if they are very narrow band (composed primarily of a single wavelength) and their phase variation is identical. Broadening the spectrum and varying the relative phase reduces the level of coherence. When coherent fields are combined, periodic fringe patterns are visible in the resulting intensity. The classic examples involve the Michelson interferometer and Young's double slit experiment shown side by side in Figure 4.1. These two examples put to practical use the effects of temporal and spatial coherence respectively. The level of both temporal coherence and spatial coherence in a source can determine the degree of calibration and precision necessary to successfully conduct an interference experiment. Temporal coherence is inversely proportional to the bandwidth of the source. In interferometry, temporal coherence is quantified by the length of time delay one can impose on a portion of a beam and still create measurable interference fringes when that delayed portion is recombined with the original beam. The amount of spatial coherence is measured by spatially shifting a portion of a field and recombining it with the original field. The spatial coherence is quantified by the spatial separation beyond which the two portions cease to create measurable interference patterns. Each style of interferometric wavefront sensor gathers information about the wavefront phase by sampling the intensity and applying software algorithms to detect known characteristics from the interference patterns.

Figure 4.1 At left: geometric interpretation of Young's double slit experiment. At right: diagram of a Michelson interferometer.


The basic formula for interference fringe patterns can be derived for both the temporal and spatial cases. Beginning with the temporal case, the second order effects of light incident on the detector can be described as a function of the path difference, h [24]. Consider the interaction of the field, u(t), with a delayed version of itself:

$$I_D(h) = \left\langle \left| K_1 u(t) + K_2 u\left(t + \tfrac{h}{c}\right) \right|^2 \right\rangle, \qquad (4.1)$$

$$\text{where } K_1, K_2 = \text{variable attenuation in each path}, \qquad (4.2)$$

$$u = \text{optical field}, \qquad (4.3)$$

$$\text{and } c = \text{speed of light}. \qquad (4.4)$$

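The intensity, $I_D$, is formed from the average squared modulus of the field. The brackets $\langle \cdot \rangle$ indicate the expectation or ensemble average operation. Expanding the product inside the expectation results in:

$$I_D(h) = \left(K_1^2 + K_2^2\right) I_0 + 2 K_1 K_2 \,\text{Re}\left\{ \Gamma\left(\tfrac{h}{c}\right) \right\}, \qquad (4.5)$$

$$\text{where } I_0 = \Gamma(0), \qquad (4.6)$$

$$\text{and } \Gamma(\tau) = \left\langle u(t + \tau)\, u^{*}(t) \right\rangle. \qquad (4.7)$$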
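The self-coherence function, $\Gamma(\tau)$, can be written in general form [24]:

$$\Gamma(\tau) = \tilde{\Gamma}(\tau) \exp\left\{ -j\left[ 2\pi\bar{\nu}\tau - \alpha(\tau) \right] \right\}, \qquad (4.8)$$

$$\text{where } \tilde{\Gamma}(\tau) = \left| \Gamma(\tau) \right|, \qquad (4.9)$$

$$\text{and } \alpha(\tau) = \text{general phase function}. \qquad (4.10)$$

The phase function, $\alpha(\tau)$, and self-coherence function can be calculated for a specific source from its power spectral density. Substituting the general form for $\Gamma(\tau)$ yields the following form for $I_D(h)$:

$$I_D(h) = \left(K_1^2 + K_2^2\right) I_0 + 2 K_1 K_2 \,\tilde{\Gamma}\left(\tfrac{h}{c}\right) \cos\left\{ 2\pi\bar{\nu}\tfrac{h}{c} - \alpha\left(\tfrac{h}{c}\right) \right\}. \qquad (4.11)$$

Assuming that the path length difference is nearly zero, $\tilde{\Gamma}(h/c) \approx I_0$ and $\alpha(h/c) \approx 0$. Making these substitutions reduces the interference pattern to a modulated sinusoidal pattern with the argument being a simple phase delay caused by the path length difference h, which will be denoted $\Delta\phi(h)$:

$$I_D(h) = \left(K_1^2 + K_2^2\right) I_0 + 2 K_1 K_2 I_0 \cos\left\{ \Delta\phi(h) \right\}. \qquad (4.12)$$

Similarly, the interference due to spatial separation can be found to be [23]:

$$I_Q(r_1, r_2) = K_1^2 \Gamma_{11}(0) + K_2^2 \Gamma_{22}(0) + \qquad (4.13)$$

$$2 K_1 K_2 \,\tilde{\Gamma}_{12}\left(\tfrac{r_2 - r_1}{c}\right) \cos\left\{ 2\pi\bar{\nu}\tfrac{r_2 - r_1}{c} - \alpha_{12}\left(\tfrac{r_2 - r_1}{c}\right) \right\}, \qquad (4.14)$$

$$\text{where } \Gamma_{11}(0) = \text{intensity at point } Q \text{ due to field propagating from pinhole } P_1 \qquad (4.15)$$

$$\text{and } \Gamma_{12} = \text{cross-correlation of light from } P_1 \text{ and } P_2. \qquad (4.16)$$

If the separation of the pinholes is very small, then $\tilde{\Gamma}_{12}\left(\tfrac{r_2 - r_1}{c}\right) \approx \sqrt{\Gamma_{11}(0)\,\Gamma_{22}(0)}$ and $\alpha_{12}\left(\tfrac{r_2 - r_1}{c}\right) \approx 0$. Making the substitution for $\tilde{\Gamma}_{12}$ and replacing $\Gamma_{11}(0)$ and $\Gamma_{22}(0)$ with intensity variables $I_1$ and $I_2$ produces the following form:

$$I_Q(r_1, r_2) = K_1^2 I_1 + K_2^2 I_2 + 2 K_1 K_2 \sqrt{I_1 I_2} \cos\left\{ \Delta\phi(r_2, r_1) \right\}. \qquad (4.17)$$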
Comparing this result to the equation for temporal interference in (4.12), the only differences lie in the relative intensities and whether the phase shift was brought about by relative path delay or a spatial shift. Assuming that $K_1 = K_2 = 1$ and $I_1 = I_2 = I_0$, both expressions reduce to the simplified interference equation:

$$I = 2 I_0 \left( 1 + \cos\{\Delta\phi\} \right). \qquad (4.18)$$

Interferometric wavefront sensors introduce a known shift or delay in a reference wavefront in one path and interfere it with the unaltered beam to generate interference patterns. From the unique characteristics of the interference pattern and the relationship in (4.18) above, the wavefront sensor will attempt to measure $\phi$, $\Delta\phi$, or some set of parameters approximating $\phi$. The degree of spatial and temporal coherence in a source will determine the amount of calibration and precision necessary to ensure that (4.18) applies and create measurable interference patterns.


Lateral Shear Interferometry. The Lateral Shearing Interferometer (LSI) is designed to provide a measure of the average wavefront slope. The LSI interferes a delayed and shifted portion of a collimated beam with the original beam. The degree of shift between the two beams is called the shear distance. The shear distance is small such that a significant portion of the two beams overlap. An interference pattern is visible within the region of overlap. Analysis of the resulting fringe pattern provides the differential phase, $\Delta\phi/\Delta s$, where $\Delta s$ represents the shear distance. A simple version of the LSI can be created from a parallel plate as shown in Figure 4.2. The plate will produce two reflections separated and delayed. More sophisticated shearing interferometers use diffraction gratings as indicated in Figure 4.3 [26]. If a diffraction grating is used, care must be taken to avoid overlapping multiple orders of diffraction. The spacing of grating lines is typically designed such that the +1 and -1 order beams are tangent. LSIs are only capable of providing slope information in the shear direction. LSIs employed in wavefront sensing create shear in two directions to provide two-dimensional slope measurement. This requires a beamsplitter or cross-grating in the case of the diffraction grating style sensor. Splitting the beam by polarization is also a viable technique. The diagrams in Figure 4.4 provide results from a Matlab simulation of lateral shear.

Figure 4.2 Diagram of a parallel plate lateral shear interferometer [26].

Figure 4.3 Diagram of a diffraction grating lateral shearing interferometer. Labeled elements: incoming wavefront, grating, interference pattern.

LSIs are used extensively in optical system testing and have been successfully implemented in adaptive optics systems [28]. Shearing interferometers can be used to measure the OTF of an optical system [22]. The OTF is measured by continuously varying the level of shear and path delay. Fast methods exist for recording both the OTF and the Modulation Transfer Function (MTF), the magnitude
of  the  OTF  [ED] ,  [31] . 
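To make the link between shear and wavefront slope concrete, the following one-dimensional Python sketch shears a defocus-like test phase and forms the two-beam fringe pattern over the region of overlap. The test phase, grid, and shear distance are illustrative assumptions, and the path delay between the two beams is ignored.

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 401)   # normalized pupil coordinate (illustrative)
dx = x[1] - x[0]
shear_samples = 4                 # shear distance expressed in samples
shear = shear_samples * dx

phase = 6.0 * x**2                # defocus-like test phase in radians

# Phase difference between the beam and its sheared copy over the overlap region
delta_phi = phase[shear_samples:] - phase[:-shear_samples]

# Two-beam fringe pattern, I = 2 I0 (1 + cos(delta_phi)), with I0 = 1
fringes = 2.0 * (1.0 + np.cos(delta_phi))

# delta_phi / shear approximates the wavefront slope at the midpoint of each pair
midpoints = 0.5 * (x[shear_samples:] + x[:-shear_samples])
slope_estimate = delta_phi / shear
true_slope = 12.0 * midpoints     # derivative of 6 x^2
```

For a quadratic phase the finite difference equals the slope at the midpoint exactly; for higher order aberrations it is the slope averaged over the shear distance, which is why the LSI is described as measuring average slope.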

Software wavefront reconstruction algorithms are used to recover the original phasefront from the interference patterns. The measured slope information is inherently modulo $2\pi$, which makes the process of integration nontrivial. This problem is evident in the basic equation for interference:

$$I = 2 I_0 \left( 1 + \cos\{\Delta\phi\} \right), \qquad (4.19)$$

$$\Delta\phi = \cos^{-1}\left( \frac{I}{2 I_0} - 1 \right), \qquad (4.20)$$

$$\Delta\phi \in [-\pi, \pi). \qquad (4.21)$$

The modulo $2\pi$ phase slope is referred to as wrapped phase information. Recovery methods must unwrap the phase and integrate to recreate the original wavefront. Sampling is critical in the phase unwrapping problem because phase changes greater than $\pi$ radians between adjacent samples can seldom be resolved. Two general categories of wavefront reconstruction include least squares curve fitting methods and path following phase unwrapping algorithms

EH,  ESI- 
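The sampling requirement can be illustrated with a one-dimensional example. The Python sketch below wraps a smooth test phase, unwraps it with NumPy's `np.unwrap` (a simple path following method), and then repeats the exercise on a coarsely sampled copy. The test phase and sampling rates are illustrative assumptions.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 200)
true_phase = 12.0 * np.pi * x**2   # smooth test phase spanning several cycles

# Wrapped phase on the principal interval, as a fringe measurement would deliver it
wrapped = np.angle(np.exp(1j * true_phase))

# Path following unwrap along one dimension
unwrapped = np.unwrap(wrapped)

# Undersampled version: some adjacent samples differ by more than pi
coarse = true_phase[::20]
coarse_unwrapped = np.unwrap(np.angle(np.exp(1j * coarse)))

print(np.allclose(unwrapped, true_phase))
print(np.allclose(coarse_unwrapped, coarse))
```

With these numbers the finely sampled phase is recovered, while the coarse copy is not: once the step between adjacent samples exceeds π radians, the unwrapping step picks the wrong multiple of 2π and the error propagates along the rest of the path.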


Figure 4.4 Simulated examples of lateral shear interference patterns. Vertical (top) and horizontal (bottom) shear directions for beams with defocus (left), astigmatism (center), and coma (right).

Point Diffraction Interferometry. Rather than measuring the average wavefront slope, it is possible to measure the wavefront directly by comparing it with an unaberrated reference wavefront. Interferometers that use a reference wavefront generated from the input wavefront are commonly called self-referencing interferometers (SRIs). Of particular importance among SRIs is the Point Diffraction Interferometer (PDI). The PDI creates a reference wavefront from pinhole diffraction [25]. Smartt proposed a simple construction where the PDI focuses the input wavefront onto a pinhole etched out of a semi-transparent material [31]. The pinhole is sufficiently small (on the order of a few μm) such that it spatially filters out all of the incoming wavefront aberrations and passes only a smooth spherical reference wave. The remainder of the input wavefront is attenuated, but not spatially filtered, by the semi-transparent material. The wavefront must be attenuated such that it has amplitude on the order of the reference wave in order to create visible interference fringes in the image plane. Visibility of fringes is directly related to the modulation of the sinusoidal pattern [2]:

$$\text{Visibility} = \frac{I_{max} - I_{min}}{I_{max} + I_{min}}. \qquad (4.22)$$
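The effect of amplitude mismatch on fringe visibility can be sketched numerically. The Python example below evaluates the visibility for two interfering beams with real amplitudes; the helper names `visibility` and `two_beam`, and the amplitude values, are illustrative assumptions.

```python
import numpy as np

def visibility(intensity):
    """Fringe visibility, (I_max - I_min) / (I_max + I_min)."""
    i_max, i_min = np.max(intensity), np.min(intensity)
    return (i_max - i_min) / (i_max + i_min)

# Two full fringe periods, sampled so that the extrema fall on grid points
delta_phi = np.linspace(0.0, 4.0 * np.pi, 2001)

def two_beam(a_ref, a_obj):
    # Intensity of two interfering beams with real amplitudes a_ref and a_obj
    return a_ref**2 + a_obj**2 + 2.0 * a_ref * a_obj * np.cos(delta_phi)

matched = visibility(two_beam(1.0, 1.0))      # equal amplitudes
mismatched = visibility(two_beam(1.0, 0.1))   # amplitudes differ by a factor of ten
```

For real amplitudes a and b the visibility works out to 2ab / (a² + b²), so it reaches its maximum of one only when the two amplitudes are equal, which is the reason the unfiltered wavefront is attenuated toward the level of the reference wave.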


Figure 4.5 Diagram of a common path PDI [26]. Labeled elements: focusing lens, PDI plate, imaging lens.


As with the LSI, there are many variations on the PDI. In particular, PDIs may have multiple beam paths or a common beam path. Figure 4.5 provides an example diagram for a common path PDI [26]. Multiple beam paths provide for both spatial and temporal dithering of the reference wavefront for improved measurement precision. Common path PDIs, on the other hand, require less calibration and offer increased tolerance to vibrations and harsh operating environments [35]. PDIs may incorporate polarizers or birefringent materials to create orthogonal polarization between the reference beam and the input beam. Orthogonal polarization between the input beam and the reference beam provides for optimal fringe visibility. Figure 4.6 provides simulated PDI interference patterns for the cases of independent defocus, astigmatism and coma.


PDIs, and more generally, self-referencing interferometers, have proven successful in adaptive optics applications [25]. While slightly more complicated than shearing interferometers and other wavefront sensor designs, PDIs offer several benefits. Since PDIs measure the wavefront phase directly, there is less error introduced in the wavefront reconstruction process. PDIs are also less sensitive to field amplitude noise, sometimes called scintillation, than the LSI and the Hartmann sensor to be discussed in Section 4.3. Common path PDIs are more effective for sources with low temporal coherence than shearing interferometers because the reference wavefront and the aberrated wavefront are not spatially shifted.

Figure 4.6 Example PDI interference patterns. Left: wavefront with defocus aberration. Center: wavefront with astigmatism. Right: wavefront with coma.

For all their benefits, PDIs create technical challenges as well. Since the reference wavefront is spatially filtered, the input beam must have enough power to provide sufficient signal to noise ratio (SNR) in the interference pattern. In the case of narrow-band, coherent inputs, Rhoadarmer et al. describe a fiber laser amplification method which dramatically improves SNR [36]. The presence of large tilt terms in the input wavefront can shift the focus of the input beam away from the spatial filter. For this reason, inputs with large tilt variance force the adaptive optics system to correct for tilt prior to the PDI wavefront sensor, or to somehow incorporate a moving pinhole in the device [38]. Each of these challenges requires an engineering solution which brings with it some set of trade-offs in complexity and efficiency.


4.2  Phase  Retrieval  from  Intensity  Measurements

The problem of wavefront sensing in Adaptive Optics is a single instance of a broader class of problems in electro-optics often referred to as phase retrieval. The need for phase retrieval arises in many other applications such as electron microscopy, x-ray imaging and single-sideband communications. The general phase retrieval problem can be summarized using a Fourier domain model. Consider the complex function and its Fourier transform:


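$$F(\boldsymbol{\xi}) = \left| F(\boldsymbol{\xi}) \right| \exp\{ j\psi(\boldsymbol{\xi}) \}, \qquad (4.23)$$

$$f(\mathbf{x}) = \left| f(\mathbf{x}) \right| \exp\{ j\phi(\mathbf{x}) \}, \qquad (4.24)$$

$$F(\boldsymbol{\xi}) = \int_{-\infty}^{\infty} f(\mathbf{x}) \exp\{ -j 2\pi \boldsymbol{\xi} \cdot \mathbf{x} \}\, d\mathbf{x}. \qquad (4.25)$$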

The phase retrieval problem in optics is synonymous with the problem statement: given
$|F(\boldsymbol{\xi})|$ and $|f(\mathbf{x})|$, find $\psi(\boldsymbol{\xi})$ and $\phi(\mathbf{x})$. As shown in (2.188), the linear systems diffraction
model relates the object and image domains through simple Fourier analysis of the optical
system. Thus, through the linear optical system model it is possible to translate the general
phase retrieval problem into an optical wavefront sensing problem. In its most general form,
the problem of phase retrieval has infinitely many solutions, making the problem ill-posed.
However, if the problem is constrained by making certain assumptions about $f(\mathbf{x})$, then
the infinite set of solutions can become a limited set of solutions, possibly even a unique
solution. Gerchberg and Saxton devised an algorithm to solve the phase retrieval
problem in electron microscopy [42]. Their approach employs an iterative technique which
makes constraining adjustments in both the object and image domains. The simplest
set of constraints forces the object and image domain amplitudes to match the measured
values at each iteration. The phase is often seeded with a random process to begin the
first iteration. The algorithm continues until some minimum error criterion is reached. The
block diagram in Figure 4.7 describes the Gerchberg-Saxton (GS) phase retrieval algorithm.
This technique is guaranteed to converge in the mean squared sense. However, there is no
guarantee that the resulting phase solution is unique. The non-uniqueness might be
tolerable if it were limited to an added constant phase; however, the ambiguity also includes
a possible sign change. Although limiting the set of solutions from infinitely many to a sign
error is a significant step, the non-uniqueness problem limits the utility of the algorithm in
wavefront sensing.
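The iteration described above can be illustrated with a minimal one-dimensional sketch. It is not the implementation used in this research: the naive unitary DFT, the function names, the fixed iteration count and the random seed are all assumptions chosen for illustration. The sketch seeds the object phase randomly, then alternately imposes the measured image and object amplitudes while retaining the current phase.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform with unitary scaling."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    scale = 1.0 / math.sqrt(n)
    return [
        scale * sum(x[k] * cmath.exp(sign * 2j * math.pi * k * m / n) for k in range(n))
        for m in range(n)
    ]


def gerchberg_saxton(obj_amp, img_amp, iterations=50, seed=0):
    """Return (object phase estimate, image-domain squared error per iteration).

    obj_amp and img_amp are the measured amplitudes |f(x)| and |F(xi)|.
    """
    rng = random.Random(seed)
    # Seed the object-domain phase with a random process.
    f = [a * cmath.exp(1j * rng.uniform(-math.pi, math.pi)) for a in obj_amp]
    errors = []
    for _ in range(iterations):
        F = dft(f)
        # Squared error between the computed and measured image amplitudes.
        errors.append(sum((abs(Fk) - A) ** 2 for Fk, A in zip(F, img_amp)))
        # Image-domain constraint: keep the phase, impose the measured amplitude.
        F = [A * cmath.exp(1j * cmath.phase(Fk)) for Fk, A in zip(F, img_amp)]
        f = dft(F, inverse=True)
        # Object-domain constraint: keep the phase, impose the measured amplitude.
        f = [a * cmath.exp(1j * cmath.phase(fk)) for fk, a in zip(f, obj_amp)]
    return [cmath.phase(fk) for fk in f], errors
```

The recorded error sequence should be non-increasing, which reflects the convergence property noted above. The returned phase remains subject to the constant-offset and sign ambiguities.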


Figure 4.7 Flow diagram of the general Gerchberg-Saxton algorithm.

To overcome the issue of non-uniqueness, Misell modified the GS algorithm to incorporate
information from two imaging paths [40]. By creating two optical paths with a
known phase difference, or phase diversity, between them, the sign in the phase aberration
could be resolved. One simple method for creating such a diversity is to purposefully offset
each image plane, thereby introducing a known amount of defocus. A diagram for Misell's
algorithm using focus diversity is shown in Figure 4.8.


Figure 4.8 Block diagram of Misell's modified GS algorithm [40].

Fienup proposed a modification to the GS algorithm for image reconstruction [53].
The modified algorithm uses phase diversity much like the Misell approach; however, it is
designed for use in post processing of images where only the image modulus is known. In this
case, no knowledge is assumed about the pupil amplitude (the object is unknown), so
the algorithm must estimate both the object and the corrections necessary to improve image
quality. The technique was designed for reconstruction of imagery from interferometric
data. The key to Fienup's algorithm is that there are two characteristics known about
the object: the object is both real and non-negative. These qualities introduce additional
constraints into the algorithm. Fienup also proposed a method to increase the rate of
convergence through negative feedback. Fienup's input-output version of the GS algorithm
is diagrammed in Figure 4.9. The set $\gamma$ represents all points where the estimated object
violates the real and non-negative constraints. Only the set of object points which violate
the constraints are modified after each iteration. The Fourier domain constraints typically
consist of the measured image amplitude.

Figure 4.9 Flow diagram of Fienup's input-output modification of the GS algorithm for
image reconstruction [53].

Shortly after the first published version of the Gerchberg-Saxton algorithm, Gonsalves
proposed a parameter searching algorithm for phase retrieval [35]. The method was later
improved to include elements of phase diversity and combined wavefront phase and object
estimation, making it an enticing algorithm for use in wavefront sensing applications [18].
The approximate pupil phase is parameterized by some suitable set of polynomials such
as the first 21 Zernike polynomials. Estimates of the pupil phase are generated in each
iteration of the algorithm. Gonsalves' algorithm boasts the capabilities of both Misell's and
Fienup's modified GS algorithms. The method is based on a mean squared error estimator
for the object and a gradient search routine to minimize error between the observed images
and the current iteration's estimated images. The phase diversity is defined just as in
Misell's algorithm: any arbitrary, but known, phase difference between two image planes.
As in Fienup's image reconstructor, the object does not need to be known, making the
algorithm ideal for extended source objects. Expressed here in the Fourier domain, the
object estimator, $\hat{O}(\boldsymbol{\xi})$, is an optimum mean squared error estimator for $O(\boldsymbol{\xi})$ based on the
prior knowledge provided in the observed images, $G_1(\boldsymbol{\xi})$ and $G_2(\boldsymbol{\xi})$, and the pupil phase
estimates $P_1(\boldsymbol{\xi})$ and $P_2(\boldsymbol{\xi})$:

$$ \hat{O}(\boldsymbol{\xi}) = \frac{G_1(\boldsymbol{\xi}) P_1^*(\boldsymbol{\xi}) + G_2(\boldsymbol{\xi}) P_2^*(\boldsymbol{\xi})}{P_1(\boldsymbol{\xi}) P_1^*(\boldsymbol{\xi}) + P_2^*(\boldsymbol{\xi}) P_2(\boldsymbol{\xi})}. \tag{4.26} $$


Figure 4.10 Block diagram of Gonsalves' parameter searching phase retrieval algorithm.

The mean squared error metric most often used in versions of Gonsalves' algorithm has
come to be known as the Gonsalves metric:

$$ E = \int \left|G_1(\boldsymbol{\xi}) - \hat{O}(\boldsymbol{\xi}) P_1(\boldsymbol{\xi})\right|^2 + \left|G_2(\boldsymbol{\xi}) - \hat{O}(\boldsymbol{\xi}) P_2(\boldsymbol{\xi})\right|^2 d\boldsymbol{\xi}. \tag{4.27} $$
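As a rough illustration of how the object estimator and the metric in (4.27) fit together, the sketch below evaluates both on sampled Fourier-domain data. It assumes the standard two-channel least squares form of the object estimator and replaces the integral with a sum over samples; the function names and the small-denominator guard `eps` are hypothetical.

```python
def gonsalves_object_estimate(G1, G2, P1, P2, eps=1e-12):
    """Least squares object estimate at each Fourier sample.

    G1, G2: observed image spectra; P1, P2: current system estimates for
    the two diversity channels. Samples where both P1 and P2 vanish carry
    no information and are set to zero (an assumption of this sketch).
    """
    estimate = []
    for g1, g2, p1, p2 in zip(G1, G2, P1, P2):
        denom = abs(p1) ** 2 + abs(p2) ** 2
        numer = g1 * p1.conjugate() + g2 * p2.conjugate()
        estimate.append(numer / denom if denom > eps else 0j)
    return estimate


def gonsalves_metric(G1, G2, P1, P2):
    """Discrete form of (4.27): summed squared residual over both channels."""
    O_hat = gonsalves_object_estimate(G1, G2, P1, P2)
    return sum(
        abs(g1 - o * p1) ** 2 + abs(g2 - o * p2) ** 2
        for g1, g2, p1, p2, o in zip(G1, G2, P1, P2, O_hat)
    )
```

When the channel estimates match the true system, the metric is zero and the object estimate reproduces the object. A gradient search over the phase parameters would drive the metric toward that minimum.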


Simulated results of Gonsalves' algorithm as well as versions of the GS algorithm are
quite remarkable. This is perhaps the reason that these algorithms have been the subject
of much research over the past 30 years. Working from nothing more than a pair of
distorted images and knowledge of the optical system, these iterative techniques are capable of
achieving near diffraction limited performance. Unfortunately, phase retrieval techniques
require significant computational power. Each version of the phase diversity algorithm is
guaranteed to converge; however, the number of iterations and computational time required
for convergence can be too great to be accomplished at the frequency of atmospheric
dynamics. Until sufficient computational power becomes available, these algorithms continue
to provide only an effective means of post processing recorded image data.


4.3 The Hartmann Wavefront Sensor

The use of interferometric techniques in the manufacture of optics becomes increasingly
difficult as the size of the optics increases. For this reason, a special test, called the
Hartmann test, was developed for the manufacture of large telescope optics. The test consists of
placing a mask of many small subapertures over the optic under test and measuring the focal
length of each subaperture. A diagram of the Hartmann test is shown in Figure 4.11. The
advantage of the Hartmann test over interferometric processes is that the Hartmann test
can be conducted with relatively broadband sources, meaning that it circumvents many of
the spatial and temporal coherence constraints emphasized in the section on interferometric
wavefront sensing. The Hartmann wavefront sensor is an adaptation of the Hartmann test.
In the Hartmann wavefront sensor, the mask of subapertures is replaced by a small array of
lenslets. The lenslet contribution is attributed to Dr. Roland Shack and, for that reason,
the wavefront sensor is often referred to as the Shack-Hartmann wavefront sensor. Figure
4.12 provides a basic diagram of the Hartmann wavefront sensor.

Figure 4.11 Schematic of a Hartmann test setup and example image plane output.

The Hartmann wavefront sensor provides a measure of the wavefront slope much like
the shearing interferometer. Each lenslet focuses onto its respective region of a CCD. The
centroid of each subaperture image shifts in proportion to the coefficients of Zernike tilt. By
locating the centroid of each subaperture image, the sensor provides an estimate of local
wavefront slope. If each lenslet image is approximated by a Gaussian shape, then it can be
shown that the centroid calculation is a maximum likelihood estimator for tilt. Recall the
Poisson conditional density for a detected image D given some noiseless image I:


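$$ P_{\mathbf{d}|\mathbf{a}}(\mathbf{D}|\mathbf{A}) = \prod_{\mathbf{u} \in S} \frac{\mathbf{I}[\mathbf{u}; \mathbf{A}]^{\mathbf{D}[\mathbf{u}]} \exp(-\mathbf{I}[\mathbf{u}; \mathbf{A}])}{\mathbf{D}[\mathbf{u}]!}. \tag{4.28} $$

If I depends only on contributions from Zernikes 2 and 3 then the conditional density
becomes:

$$ P_{\mathbf{d}|a_2,a_3}(\mathbf{D}|A_2, A_3) = \prod_{\mathbf{u} \in S} \frac{\mathbf{I}[\mathbf{u}; A_2, A_3]^{\mathbf{D}[\mathbf{u}]} \exp(-\mathbf{I}[\mathbf{u}; A_2, A_3])}{\mathbf{D}[\mathbf{u}]!}. \tag{4.29} $$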

Figure 4.12 Diagram of a Hartmann sensor.

In the paraxial region of the image plane, there exists a linear relationship between the
Zernike tilt coefficients $A_2$ and $A_3$ and independent pixel shifts in the $\xi$ and $\eta$ directions
respectively. An incoming ray making an angle with the optical axis will intercept the
image plane offset from the optical axis proportional to the image distance. Using this
geometric analysis, Zernike tilt relates to the offset distance in the image plane by similar
triangles as shown in Figure 4.13. Using this geometric construct, the expression for $A_2$ in
terms of an arbitrary pixel shift $\Delta u$ is derived below. Begin with the expression for x-tilt:

Figure 4.13 Ray optics diagram demonstrates the relationship between a single pixel shift
in the image plane and the Zernike tilt parameter.


$$ A_2 Z_2(x, \theta) = A_2\, 2\left(\frac{x}{R_p}\right)\cos(\theta). $$

Substitute in the point along the x axis at the edge of the aperture, $(x = R_p, \theta = 0)$:

$$ A_2 Z_2(R_p, 0) = 2A_2. \tag{4.30} $$

Now form an expression equating the ratios of the sides of the similar triangles from Figure
4.13 and solve for $A_2$:

$$ \frac{A_2 Z_2(R_p, 0)}{R_p} = \frac{d_\xi}{S_i}, \tag{4.31} $$

$$ A_2 = \frac{R_p d_\xi}{2 S_i}. \tag{4.32} $$

The distance $d_\xi$ has units of meters. The index variable $\Delta u$, which has units of pixels, is more
useful in the derivation to follow. Recall that the width of a pixel in the image plane is
given by $\Delta\xi$ in (3.19). To relate the coefficient $A_2$ to pixel shifts, substitute $d_\xi = \Delta u \Delta\xi$
into the expression for $A_2$ in (4.32) and solve for the pixel shift:

$$ A_2 = \frac{R_p \Delta u \Delta\xi}{2 S_i}, \tag{4.33} $$

$$ \Delta u = \frac{2 A_2 S_i}{R_p \Delta\xi}. \tag{4.34} $$

Similarly, the expression for $\Delta v$ is given by:

$$ \Delta v = \frac{2 A_3 S_i}{R_p \Delta\eta}. \tag{4.35} $$


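To compact the notation, denote the pixel shifts as functions of the Zernike parameter and
combine them into a vector format: $\Delta\mathbf{u}(A_2, A_3) = (\Delta u(A_2), \Delta v(A_3))$. Now return to
(4.29) and substitute the shift function, $\Delta\mathbf{u}(A_2, A_3)$, for the expected image arguments $A_2$
and $A_3$:

$$ P_{\mathbf{d}|a_2,a_3}(\mathbf{D}|A_2, A_3) = \prod_{\mathbf{u} \in S} \frac{\mathbf{I}[\mathbf{u}; \Delta\mathbf{u}(A_2, A_3)]^{\mathbf{D}[\mathbf{u}]} \exp(-\mathbf{I}[\mathbf{u}; \Delta\mathbf{u}(A_2, A_3)])}{\mathbf{D}[\mathbf{u}]!}. \tag{4.36} $$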


Note that this substitution does not change the expression except to reveal to the reader
the direct relationship between the tilt parameters and the xy shift of the expected image in
the image plane. Now make the approximation that the expected image, I, has a Gaussian
spatial distribution. This approximation is based on the fact that the central lobe of the
Fraunhofer diffraction pattern for a diffraction limited optical system resembles the bell
shaped curve of a Gaussian distribution. Approximate the noiseless image as:

$$ \mathbf{I}[\mathbf{u} - \Delta\mathbf{u}(A_2, A_3)] = \frac{K}{2\pi\sigma^2} \exp\left(-\frac{|\mathbf{u} - \Delta\mathbf{u}(A_2, A_3)|^2}{2\sigma^2}\right), \tag{4.37} $$

where $K$ is a scaling constant (4.38) and $\sigma^2$ is a variance parameter based on
the system f/# (4.39).


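The ML estimator log likelihood expression using the input conditional pdf in (4.36) is given
by:

$$ L_{ml}(A_2, A_3) = \sum_{\mathbf{u} \in S} \mathbf{D}[\mathbf{u}] \ln\{\mathbf{I}[\mathbf{u}; \Delta\mathbf{u}(A_2, A_3)]\} - \mathbf{I}[\mathbf{u}; \Delta\mathbf{u}(A_2, A_3)]. \tag{4.40} $$

Substituting the Gaussian form of I into the log likelihood expression:

$$ L_{ml}(A_2, A_3) = \sum_{\mathbf{u} \in S} \mathbf{D}[\mathbf{u}] \ln\left\{\frac{K}{2\pi\sigma^2}\right\} - \mathbf{D}[\mathbf{u}] \frac{|\mathbf{u} - \Delta\mathbf{u}(A_2, A_3)|^2}{2\sigma^2} - \frac{K}{2\pi\sigma^2} \exp\left(-\frac{|\mathbf{u} - \Delta\mathbf{u}(A_2, A_3)|^2}{2\sigma^2}\right). \tag{4.41} $$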


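Solving for the ML estimator requires maximizing the log likelihood expression over the
range of the shift function, $\Delta\mathbf{u}$, by differentiating with respect to each parameter and
setting the result equal to zero.

$$ L_{ml}(A_2, A_3) = \sum_{[u,v] \in S} \mathbf{D}[u,v] \ln\left\{\frac{K}{2\pi\sigma^2}\right\} - \mathbf{D}[u,v] \frac{(u - \Delta u(A_2))^2 + (v - \Delta v(A_3))^2}{2\sigma^2} - \frac{K}{2\pi\sigma^2} \exp\left(-\frac{(u - \Delta u(A_2))^2}{2\sigma^2}\right) \exp\left(-\frac{(v - \Delta v(A_3))^2}{2\sigma^2}\right), \tag{4.42} $$

$$ \frac{\partial}{\partial A_2} L_{ml}(A_2, A_3) = \sum_{[u,v] \in S} \mathbf{D}[u,v] \frac{u - \Delta u(A_2)}{\sigma^2} \frac{2 S_i}{R_p \Delta\xi} - \frac{u - \Delta u(A_2)}{\sigma^2} \frac{2 S_i}{R_p \Delta\xi} \frac{K}{2\pi\sigma^2} \exp\left(-\frac{(u - \Delta u(A_2))^2}{2\sigma^2}\right) \exp\left(-\frac{(v - \Delta v(A_3))^2}{2\sigma^2}\right), \tag{4.43} $$

$$ \frac{\partial}{\partial A_3} L_{ml}(A_2, A_3) = \sum_{[u,v] \in S} \mathbf{D}[u,v] \frac{v - \Delta v(A_3)}{\sigma^2} \frac{2 S_i}{R_p \Delta\eta} - \frac{v - \Delta v(A_3)}{\sigma^2} \frac{2 S_i}{R_p \Delta\eta} \frac{K}{2\pi\sigma^2} \exp\left(-\frac{(u - \Delta u(A_2))^2}{2\sigma^2}\right) \exp\left(-\frac{(v - \Delta v(A_3))^2}{2\sigma^2}\right). \tag{4.44} $$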


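Assuming that the shift function is distributed Gaussian, the second term in each derivative
represents a constant multiplying the central moment of the shift function:

$$ \sum_{[u,v] \in S} (u - \Delta u(A_2)) \exp\left\{-\frac{(u - \Delta u(A_2))^2}{2\sigma^2}\right\} = 0. \tag{4.45} $$

Under the atmospheric model, the central moment for all modes is zero. This makes
solving for the parameter estimates straightforward:

$$ \sum_{[u,v] \in S} \mathbf{D}[u,v] \frac{u - \Delta u(A_2)}{\sigma^2} \frac{d\Delta u(A_2)}{dA_2} = 0, \tag{4.46} $$

$$ \sum_{[u,v] \in S} \mathbf{D}[u,v] (u - \Delta u(A_2)) = 0, \tag{4.47} $$

$$ \text{therefore } \Delta u_{ml} = \frac{\sum_{[u,v] \in S} \mathbf{D}[u,v]\, u}{\sum_{[u,v] \in S} \mathbf{D}[u,v]}, \tag{4.48} $$

$$ \sum_{[u,v] \in S} \mathbf{D}[u,v] \frac{v - \Delta v(A_3)}{\sigma^2} \frac{d\Delta v(A_3)}{dA_3} = 0, \tag{4.49} $$

$$ \sum_{[u,v] \in S} \mathbf{D}[u,v] (v - \Delta v(A_3)) = 0, \tag{4.50} $$

$$ \text{therefore } \Delta v_{ml} = \frac{\sum_{[u,v] \in S} \mathbf{D}[u,v]\, v}{\sum_{[u,v] \in S} \mathbf{D}[u,v]}. \tag{4.51} $$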


Thus, the maximum likelihood estimator for each shift parameter is a centroid calculation.
The estimators for $\Delta u$ and $\Delta v$ differ from the estimators for the Zernike coefficients by a
constant multiplier:

$$ \hat{A}_{2,ml} = \frac{R_p \Delta\xi}{2 S_i} \Delta u_{ml}, \tag{4.52} $$

$$ \hat{A}_{3,ml} = \frac{R_p \Delta\eta}{2 S_i} \Delta v_{ml}. \tag{4.53} $$
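The centroid estimators in (4.48) and (4.51) and the scaling in (4.52) and (4.53) can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the image is a list of rows indexed as `image[v][u]`, and pixel coordinates are assumed to be measured from the centre of the subaperture window.

```python
def centroid_shift(image):
    """Return the centroid (du, dv) in pixels, relative to the window centre."""
    rows, cols = len(image), len(image[0])
    total = sum(sum(row) for row in image)
    if total == 0:
        raise ValueError("image contains no counts")
    u0, v0 = (cols - 1) / 2.0, (rows - 1) / 2.0
    du = sum((u - u0) * x for row in image for u, x in enumerate(row)) / total
    dv = sum((v - v0) * x for v, row in enumerate(image) for x in row) / total
    return du, dv


def tilt_coefficient(shift_pixels, aperture_radius, pixel_width, image_distance):
    """Scale a pixel shift to a Zernike tilt coefficient as in (4.52) and (4.53)."""
    return aperture_radius * shift_pixels * pixel_width / (2.0 * image_distance)
```

For example, a spot displaced one pixel along u gives `du = 1.0`, and `tilt_coefficient(du, R_p, pixel_width, S_i)` then returns the corresponding x-tilt estimate.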


Each  lenslet  in  the  Hartmann  sensor  provides  the  same  information  as  a  single  pixel  in  the 
shearing  interferometer  [3] .  The  Hartmann  sensor  is  preferred  over  shearing  interferometers 
in  low  light  and  low  signal  to  noise  ratio  viewing  [48].  Although  its  performance  is  often 
better  than  that  of  shearing  interferometers,  the  centroid  calculation  in  each  subaperture 
of  the  Hartmann  sensor  is  still  prone  to  error  in  low  SNR  environments.  Additionally,  the 
Gaussian  distribution  in  the  image  may  not  be  an  accurate  assumption  when  the  object  is 
an  extended  source. 
4.4 Wavefront Sensing Using Image Projections


The Hartmann wavefront sensor can be modified to operate on image projections. The
short wavelength adaptive techniques (SWAT) wavefront sensor system at MIT Lincoln Labs
uses this technique to accelerate the tilt estimation process [49]. The basic configuration
of the projection based sensor requires two cameras providing orthogonal image projections
as described in Figure 4.14. This configuration divides the light into two paths
and reduces the amount of signal available in each camera output. Therefore, what the
projection based wavefront sensor loses in signal to noise ratio, it must make up for in
speed, efficiency and reduction in read noise. In the SWAT wavefront sensor, projections
are formed on the CCD, outputting only N locations for an N x N image space. The
background section on image vector projections underscored the importance of vector output
from the standpoint of CCD read out time and CCD read noise. Thus, the first design
improvement over a simple two-dimensional centroid sensor is to simply compute part of
the centroid operation directly on the CCD prior to read out.

Figure 4.14 Conceptual example of two orthogonal projections provided by cameras in a
projection correlating tilt sensor. (Diagram labels: aberrated wavefront, 50/50
beamsplitter, sum of pixels along y, y-axis projection.)
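A short sketch may help show why part of the centroid can be computed before read out. Summing an N x N image along one axis leaves N values, and the one-dimensional centroid of that projection equals the corresponding coordinate of the two-dimensional centroid. The function names are hypothetical, and the software summation only stands in for the on-chip operation.

```python
def project_onto_u(image):
    """Sum over v (each column) so an N x N image yields N values along u."""
    return [sum(row[u] for row in image) for u in range(len(image[0]))]


def project_onto_v(image):
    """Sum over u (each row) so an N x N image yields N values along v."""
    return [sum(row) for row in image]


def centroid_1d(vec):
    """Centroid index of a projection vector."""
    return sum(i * x for i, x in enumerate(vec)) / sum(vec)


def centroid_2d(image):
    """Reference two-dimensional centroid (u, v) computed from every pixel."""
    total = sum(sum(row) for row in image)
    cu = sum(u * x for row in image for u, x in enumerate(row)) / total
    cv = sum(v * x for v, row in enumerate(image) for x in row) / total
    return cu, cv
```

Each camera in the two-path configuration would supply one of the two projections, so only 2N values are read per frame rather than N squared.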

The centroiding sensor may be improved further. To improve the quality of subaperture
tilt estimates in low signal to noise environments and provide a more accurate
tilt estimator for extended objects, the centroiding calculation should be replaced by an
image correlating technique. The cross-correlating tilt estimator provides for greater
noise rejection and is better suited for extended source objects. The cross-correlation
calculation requires more computational power than a simple centroid calculation. However,
the image projection reduces the calculation time required for correlation style image
registration, making the calculation feasible. The following derivation outlines the efforts of
Cain et al. [51].

Recall the general projection operator from Section 3.3. Using this notation, the two
projections suggested by Cain may be expressed:


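$$ {}^{s}\mathbf{v}\left(\mathbf{D}_{1,0}\right) \tag{4.54} $$

$$ \text{and } {}^{s}\mathbf{v}\left(\mathbf{D}_{2,90}\right), \tag{4.55} $$

$$ \text{where } s = \{(1, N_W)\}. \tag{4.56} $$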
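The window size, $N_W$, has been left arbitrary. Consider, once again, the tilt only conditional
pdf given in (4.29). Substituting the image projection into (4.29) and separating the
conditional pdf into two independent tilt expressions gives:

$$ P_{{}^{s}\mathbf{v}(\mathbf{d}_{1,0})|a_2}\left({}^{s}\mathbf{v}(\mathbf{D}_{1,0})\,|\,A_2\right) = \prod_{l=1}^{N_W} \frac{{}^{s}v_l\left(\mathbf{I}_{1,0}[A_2]\right)^{{}^{s}v_l(\mathbf{D}_{1,0})} \exp\left({}^{s}v_l\left(-\mathbf{I}_{1,0}[A_2]\right)\right)}{{}^{s}v_l\left(\{\mathbf{D}_{1,0}[\mathbf{u}]! : \mathbf{u} \in S_1\}\right)}, \tag{4.57} $$

$$ P_{{}^{s}\mathbf{v}(\mathbf{d}_{2,90})|a_3}\left({}^{s}\mathbf{v}(\mathbf{D}_{2,90})\,|\,A_3\right) = \prod_{l=1}^{N_W} \frac{{}^{s}v_l\left(\mathbf{I}_{2,90}[A_3]\right)^{{}^{s}v_l(\mathbf{D}_{2,90})} \exp\left({}^{s}v_l\left(-\mathbf{I}_{2,90}[A_3]\right)\right)}{{}^{s}v_l\left(\{\mathbf{D}_{2,90}[\mathbf{u}]! : \mathbf{u} \in S_2\}\right)}. \tag{4.58} $$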
    Δu_ml = 0
    while L_a2(Δu_ml) < max{ L_a2(Δu_ml - 2), L_a2(Δu_ml - 1), L_a2(Δu_ml + 1), L_a2(Δu_ml + 2) }
        if L_a2(Δu_ml + 1) + L_a2(Δu_ml + 2) > L_a2(Δu_ml - 1) + L_a2(Δu_ml - 2)
            Δu_ml = Δu_ml + 1
        else
            Δu_ml = Δu_ml - 1
        end
    end

Table 4.1 Pseudo code for the 1-D linear search algorithm.
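The pseudo code in Table 4.1 can be read as the following runnable sketch. It reflects one interpretation of the table, in which the loop continues while any of the four neighbouring shifts has a larger log likelihood than the current estimate. The function name and the `max_steps` guard are assumptions added for illustration, and `loglik` is any callable that returns the log likelihood for an integer pixel shift.

```python
def linear_search(loglik, start=0, max_steps=1000):
    """Step the integer shift estimate one pixel at a time toward the maximum."""
    shift = start
    for _ in range(max_steps):
        neighbours = [
            loglik(shift - 2),
            loglik(shift - 1),
            loglik(shift + 1),
            loglik(shift + 2),
        ]
        # Stop once the current estimate is at least as good as its neighbours.
        if loglik(shift) >= max(neighbours):
            break
        # Otherwise move one pixel toward the side with the larger summed likelihood.
        if loglik(shift + 1) + loglik(shift + 2) > loglik(shift - 1) + loglik(shift - 2):
            shift += 1
        else:
            shift -= 1
    return shift
```

Comparing sums over two neighbours on each side, rather than single samples, gives the step direction some tolerance to noise in the likelihood surface.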


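From (4.57) and (4.58), the log likelihood functions are given:

$$ L_{a_2}(A_2) = \sum_{l=1}^{N_W} {}^{s}v_l\left(\mathbf{D}_{1,0}\right) \ln\left\{{}^{s}v_l\left(\mathbf{I}_{1,0}[A_2]\right)\right\} - {}^{s}v_l\left(\mathbf{I}_{1,0}[A_2]\right), \tag{4.59} $$

$$ L_{a_3}(A_3) = \sum_{l=1}^{N_W} {}^{s}v_l\left(\mathbf{D}_{2,90}\right) \ln\left\{{}^{s}v_l\left(\mathbf{I}_{2,90}[A_3]\right)\right\} - {}^{s}v_l\left(\mathbf{I}_{2,90}[A_3]\right). \tag{4.60} $$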


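Note that the reference image projections, $\mathbf{I}_{1,0}$ and $\mathbf{I}_{2,90}$, no longer have a specified form
such as the Gaussian approximation given in the centroid estimation case. The reference
image projections may be formed from a known object or from an ensemble average of
images in the case where the object is unknown. The reference images can be posed as
functions of the arbitrary pixel shift variables from the previous section:

$$ L_{a_2}(\Delta u) = \sum_{l=1}^{N_W} {}^{s}v_l\left(\mathbf{D}_{1,0}\right) \ln\left\{{}^{s}v_l\left(\mathbf{I}_{1,0}[\Delta u]\right)\right\} - {}^{s}v_l\left(\mathbf{I}_{1,0}[\Delta u]\right), \tag{4.61} $$

$$ L_{a_3}(\Delta v) = \sum_{l=1}^{N_W} {}^{s}v_l\left(\mathbf{D}_{2,90}\right) \ln\left\{{}^{s}v_l\left(\mathbf{I}_{2,90}[\Delta v]\right)\right\} - {}^{s}v_l\left(\mathbf{I}_{2,90}[\Delta v]\right). \tag{4.62} $$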
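The shift-dependent log likelihood in (4.61) can be evaluated numerically as sketched below. This is an illustration under stated assumptions, not the cited authors' code: the reference projection is shifted circularly so that every element stays strictly positive, shifts are restricted to integers, and the function names are hypothetical.

```python
import math


def circular_shift(vec, shift):
    """Shift a projection vector by an integer number of pixels, wrapping at the ends."""
    n = len(vec)
    return [vec[(i - shift) % n] for i in range(n)]


def projection_loglik(data_proj, ref_proj, shift):
    """Poisson log likelihood of a data projection for a shifted reference projection.

    Terms that do not depend on the shift (the factorials) are omitted.
    The reference values must be strictly positive.
    """
    shifted = circular_shift(ref_proj, shift)
    return sum(d * math.log(r) - r for d, r in zip(data_proj, shifted))
```

Evaluating `projection_loglik` over a range of candidate shifts and taking the largest value gives the integer-pixel estimate that a search routine would look for.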


Figure 4.15 Diagram for the 1-D log likelihood maximization algorithm.

Expressions (4.59) and (4.60) are essentially correlation functions. Maximizing these
correlation expressions over a carefully selected region of the shift parameters, $\Delta u$ and $\Delta v$,
is the job of the correlating tilt wavefront sensor. Cain suggests a linear search method for
maximizing the correlation. Examining the case for $\Delta u$, update the current estimate
$\Delta u_{ml}$ via the algorithm in Table 4.1. The diagram in Figure 4.15 describes the search
algorithm. The granularity of this approach is limited by the pixel size in the detector,
$\Delta\xi$. The shift parameters can be estimated to an accuracy finer than a pixel dimension by
interpolating the reference image vector, $\mathbf{v}(\mathbf{I}_{1,0})$. A linearly interpolated vector $\mathbf{v}(\mathbf{I}_{1,0})$
is formed as follows:

$$ \mathbf{v}\left(\mathbf{I}_{1,0}[u - \Delta u]\right) = \left(1 - \Delta u + \mathrm{floor}(\Delta u)\right) \mathbf{v}\left(\mathbf{I}_{1,0}[u - \mathrm{floor}(\Delta u)]\right) + \left(\Delta u - \mathrm{floor}(\Delta u)\right) \mathbf{v}\left(\mathbf{I}_{1,0}[u - (\mathrm{floor}(\Delta u) + 1)]\right), \tag{4.63} $$

where $\Delta u$ is the arbitrary shift parameter, not restricted to integer increments.
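A minimal sketch of the linear interpolation in (4.63) follows, assuming the two interpolation weights sum to one and that samples falling outside the reference vector are treated as zero. Both the function name and the zero-padding choice are assumptions made for illustration.

```python
import math


def interpolated_shift(ref, shift):
    """Shift a reference projection by a possibly fractional number of pixels."""
    n = len(ref)
    whole = math.floor(shift)
    frac = shift - whole

    def sample(i):
        # Zero padding outside the window (an assumption of this sketch).
        return ref[i] if 0 <= i < n else 0.0

    return [
        (1 - frac) * sample(u - whole) + frac * sample(u - whole - 1)
        for u in range(n)
    ]
```

For instance, shifting the vector `[0, 0, 4, 0, 0]` by half a pixel splits the peak evenly between two neighbouring elements, while an integer shift moves it intact.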

4.5 Summary

This chapter reviewed many existing techniques for detecting wavefront phase from
intensity measurements. The discussion began with interferometric techniques. The Lateral
Shearing Interferometer provides a measurement of the wavefront phase slope and the
Self-Referencing Interferometer provides a measurement of the wavefront phase.
Interferometric techniques have been proven to be effective in adaptive optics systems. Phase
retrieval methods were also discussed. These methods are predominantly employed in
offline analysis due to the computational complexity involved. The Hartmann sensor is an
adaptation of the Hartmann test used for measuring the imperfections in large optics. The
Hartmann wavefront sensor is very common due to its speed and effectiveness for measuring
localized wavefront tilt. A short derivation demonstrated that the Hartmann wavefront
sensor centroid measurement is a closed form solution to the maximum likelihood image
shift estimator. The closed form expression for the ML shift estimator comes at the
expense of assuming a Gaussian distributed image intensity. The SWAT sensor modifies the
Hartmann wavefront sensor to use image projections. The image projection significantly
reduces read out noise and CCD read time. Cain's projection correlating wavefront sensor
makes use of image projections as well. The projection correlating tilt sensor uses a linear
search method for maximizing the likelihood expression rather than a closed form expression.
Cain's simulation of the projection correlating tilt estimator shows that its performance is
better than that of the centroiding tilt estimator under low light conditions. The correlating
technique which provides superior tilt estimation performance can be adapted to estimate
higher order Zernike terms, specifically the seven Zernikes Z4 through Z10. The remainder
of this dissertation describes two parameter estimating wavefront sensors designed to
estimate and correct for both the tilt polynomials, Z2 and Z3, and higher order Zernikes
up to Z10.
5. The Z2-4 Wavefront Sensor

This chapter details the first of two wavefront curvature sensors. Recall that a curvature
sensor is a wavefront sensor designed to detect some limited set of higher order aberrations
along with x-tilt and y-tilt. The Z2-4 wavefront sensor measures Zernike coefficients $a_2$
through $a_4$ from point source image vectors. The Z2-4 wavefront sensor design builds
on the projection correlating maximum likelihood (ML) tilt estimator. This curvature
sensor differs from the tilt sensor in several ways. The first, and perhaps the most crucial,
difference is the assumption that the sensor will be configured to image a known beacon
object. Second, the curvature sensor estimates defocus error within each subaperture. Also,
the curvature sensor hardware requires a defocus diversity between the pair of CCD arrays.
The estimation algorithm, which I will refer to as the Z2-4 estimator, is based on a maximum
a posteriori (MAP) estimator rather than a maximum likelihood (ML) estimator. Additionally,
the expected image lookup has been expanded to account for parameters beyond tilt and the
expected images take advantage of the known object assumption. Finally, the likelihood
maximization approach has been updated to increase speed and efficiency. While not as
effective at registering arbitrary images due to the beacon object assumption, the curvature
sensor presented here outperforms the ML and centroid techniques when simulated using a
point source input. Below, this chapter provides the details of the Z2-4 curvature sensor,
which include: a description of the hardware considerations, a derivation of the projection
based Z2-4 estimator algorithm, techniques for fast and efficient likelihood maximization,
and a summary of key sensor design variables.

5.1  Sensor  Hardware 

The focus of this research is on the design of the sensor estimation algorithm rather
than sensor hardware design. In keeping with this theme, the purpose of describing the
hardware configuration will be limited to identifying key design variables and how they
affect the estimation algorithm. The sensor hardware can be broken down into three main
components: an array of subapertures, a beamsplitter, and a pair of photon counting CCDs.
This configuration is not unique and, in fact, includes the same combination of components
discussed in the review of Hartmann sensors and phase diversity techniques. This section
will review these three components and highlight any assumptions or requirements specific
to the Z2-4 estimator algorithm.

The key design variables associated with the subaperture array are the diameter and
focal length of the subapertures. The wavefront sensor will be constructed from an array
of subapertures. Assume that the diameters of all individual subapertures as well as their
focal lengths are identical. Furthermore, assume that the pixel size in the CCDs is chosen
according to the Nyquist criterion, adjusted for f/#. Under this assumption, the
focal length becomes transparent to the estimator algorithm design. Therefore, assume that
the design of the subaperture array only impacts the estimation algorithm via the ratio of
the fixed subaperture diameter to the changing operating environment variables: $r_0$, $L_0$, $l_0$.
Since each of the atmospheric variables is an estimated parameter provided to the sensor
algorithm, it will be important to evaluate the performance of the estimator algorithm over
a range of these values and to evaluate the sensitivity of that performance to errors in each
estimate.

The beamsplitting device allows the sensor to focus the subaperture array onto two
image planes. Lee et al. demonstrated that the ideal configuration for the beamsplitting
device is to provide equal power in both imaging paths in a phase retrieval system [52].
The same performance characteristic holds for the projection based algorithm. Therefore,
as a conservative figure, I will assume that the beamsplitter used is a 50/50 beamsplitter
with 95% efficiency. The efficiency factor is crucial when comparing the Z2-4 curvature
sensor to other sensors that do not require beamsplitting.

The key design assumptions associated with the CCD arrays are the Nyquist sampled
pixel size discussed previously, the ability to produce image projections, and their placement
relative to the geometric focus. The first two considerations are straightforward, which
leaves the variable of CCD placement open for trade study. Ambiguity in the effects
of higher order Zernikes on intensity measurements can be reduced by applying a known
phase diversity between the two image planes. In the case of the Z2-4 sensor, the phase
diversity is necessary because the defocus parameter exhibits identical effects in intensity
whether the coefficient is positive or negative. Misell suggested that the simplest method
for adding a known phase diversity is to introduce a defocus error by purposefully offsetting
the image plane [40]. Lee et al. showed that the defocus diversity should be applied equally
and with opposite sign in each imaging path when using two-dimensional image data for
phase retrieval [52]. I have confirmed that the same defocus diversity configuration is
ideal for estimating phase from image projections. For this reason, the sensor defocus
diversity, labeled $\pm\delta a_4$, will be expressed as an absolute value in units of radians, where it
is understood that the defocus will be applied positive in one CCD plane and negative in
the other. The ideal choice of diversity will depend largely on both the ratio of subaperture
diameter to $r_0$ and the average photon count per subaperture per exposure.


5.2 Image Projections

The Z2-4 sensor image projection is a shifted and summed set of pixels from the
original image as indicated in Figure 5.1. As shown in Figure 5.1, each CCD has an
associated angle of rotation. Applying a rotation to a CCD allows for taking projections
in different directions. For convenience of projection notation, the summation is always
performed across the v direction. Specifying the projection direction is accomplished by
associating with each CCD a rotational reference, $\theta_i$, relative to the AO reference image
axes. For a description of the variables defining the aperture and image plane, recall the
notation for a discrete image formed within the linearized optical model:


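$$\mathbf{I}[u\Delta\xi, v\Delta\eta; \mathbf{a}] = \left(\frac{\Delta x\,\Delta y}{\lambda s_i}\right)^2 \left|\mathcal{DFT}\left\{\mathcal{P}[n\Delta x, m\Delta y; \mathbf{a}]\right\}\right|^2, \tag{5.1}$$

$$\mathbf{I}[u, v; \mathbf{a}] = \mathbf{I}[u\Delta\xi, v\Delta\eta; \mathbf{a}], \tag{5.2}$$

$$\mathbf{I}[\mathbf{a}] = \left\{\mathbf{I}[u, v; \mathbf{a}] : u, v \in S\right\}. \tag{5.3}$$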


As an example, consider estimating tilt using two orthogonal projections along the reference
AO x and y axes, as in Cain's ML estimator. This is described in the context of the image
projection operator by establishing two CCDs, one offset by an angle of 0° and another
offset by 90°. The Z2-4 sensor uses this same configuration. As such, the CCD images are
denoted $\mathbf{D}_{1,0}$ and $\mathbf{D}_{2,90}$. The pair of projections used in the Z2-4 estimator will be referred
to as whole plane projections. Whole plane projections are single vectors produced by
summing along the v axis over the entire windowed region. The size of the window should be
chosen to keep both the residual error and the computation time acceptable. For the purpose
of this description, the window size will remain variable.

Figure 5.1 Diagram of the Z2-4 sensor's whole-plane image projection operation for even
and odd length windows.

Although image projections are read out of the CCDs only once during each exposure, the
projection operations used in the estimators $\hat{a}_2$, $\hat{a}_3$ and $\hat{a}_4$ are distinguished in the
notation as if they were separate vectors for mathematical convenience. The three Z2-4
sensor projection operations are:


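$${}_{\{(1,N_W)\}}\mathbf{v}\left(\mathbf{D}_{1,0}\right), \tag{5.4}$$

$${}_{\{(1,N_W)\}}\mathbf{v}\left(\mathbf{D}_{2,90}\right), \tag{5.5}$$

$${}_{\{(1,N_W)\}}\mathbf{v}\left(\mathbf{D}_{1,0}, \mathbf{D}_{2,90}\right). \tag{5.6}$$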
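
The whole plane projections can be illustrated with a short sketch. This is only a minimal illustration, not the sensor's read out logic: the function names, the centered window and the use of a transpose to stand in for the 90° CCD rotation are all assumptions made here (a true rotation would also reverse one index).

```python
def whole_plane_projection(image, window):
    """Sum a square image along its v axis (rows) inside a centered window.

    image  : list of rows, indexed image[v][u]
    window : number of pixels N_W kept along each axis (hypothetical name)
    Returns a vector of length `window` indexed by u.
    """
    n = len(image)
    start = (n - window) // 2            # center the window on the array
    rows = image[start:start + window]
    return [sum(row[u] for row in rows) for u in range(start, start + window)]


def rotate_90(image):
    """Emulate a CCD rotated by 90 degrees with a transpose (assumption)."""
    return [list(column) for column in zip(*image)]


def z24_projections(d1, d2, window):
    """Return the projections of CCD 1, CCD 2, and both concatenated."""
    p1 = whole_plane_projection(d1, window)
    p2 = whole_plane_projection(rotate_90(d2), window)
    return p1, p2, p1 + p2
```

The concatenated vector has twice the window length, which matches the summation limit used for the defocus likelihood later in the chapter.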
5.3 Likelihood Expressions

The sensor hardware provides two image projections from each subaperture. Figure
5.2 diagrams the read out and flow of two image projections through the estimator algorithm.

Figure 5.2 Diagram shows the flow of image projections through the Z2-4 estimation
algorithm.

Each likelihood expression requires four inputs: a detected image projection, a reference
image projection, a set of atmospheric parameter estimates and an estimate of the current
photon level, K. Solid lines indicate the flow of real time detected image projections.
Dashed lines indicate information used for reference projections which are computed and
stored into lookup tables during sensor calibration. The atmospheric parameter estimates
and K do not need to be updated at every image frame, but should be updated as often as
necessary to ensure some threshold of accuracy. The heavily outlined blocks in Figure 5.2
indicate locations where a likelihood expression is evaluated. Recall the general form for
the MAP estimator:


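$$\max_{\mathbf{A}} \left\{ \sum_{l=1}^{N_V N_W} \left( \left[{}_{S}\mathbf{v}_l(\mathbf{D}_u) + \sigma_{ro}^2\right] \ln\left\{{}_{S}\mathbf{v}_l(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2\right\} - {}_{S}\mathbf{v}_l(\mathbf{I}_u[\mathbf{A}]) \right) - \frac{\mathbf{A}\Lambda_a^{-1}\mathbf{A}^T}{2} \right\} \Bigg|_{\mathbf{A} = \hat{\mathbf{a}}_{map}} \tag{5.7}$$

The likelihood expression, denoted $L_{map}(\mathbf{A})$, is the expression to be maximized. Extracting
the likelihood expression from the MAP estimator equation and tailoring the projection
operator specifically for the Z2-4 estimator yields:

$$L_{map}(\mathbf{A}) = \sum_{l=1}^{N_V N_W} \left( \left[{}_{\{(1,N_W)\}}\mathbf{v}_l(\mathbf{D}_{1,0}, \mathbf{D}_{2,90}) + \sigma_{ro}^2\right] \ln\left\{{}_{\{(1,N_W)\}}\mathbf{v}_l(\mathbf{I}_{1,0}[\mathbf{A}], \mathbf{I}_{2,90}[\mathbf{A}]) + \sigma_{ro}^2\right\} - {}_{\{(1,N_W)\}}\mathbf{v}_l(\mathbf{I}_{1,0}[\mathbf{A}], \mathbf{I}_{2,90}[\mathbf{A}]) \right) - \frac{\mathbf{A}\Lambda_a^{-1}\mathbf{A}^T}{2}, \tag{5.8}$$

where

$$\mathbf{A} = \text{the infinite set of atmospheric parameters}, \tag{5.9}$$

$$\Lambda_a = \text{the parameter covariance matrix}, \tag{5.10}$$

$$\sigma_{ro}^2 = \text{CCD read noise variance}. \tag{5.11}$$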
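
As a rough illustration of how a likelihood of this form is evaluated for one candidate parameter value, consider the sketch below. It assumes a diagonal parameter covariance, so the prior term reduces to a sum of $A_k^2 / 2\sigma_k^2$ terms, and it takes the detected and expected projections as plain lists; the function and argument names are hypothetical.

```python
import math


def map_log_likelihood(detected, expected, read_var, params, param_vars):
    """Evaluate sum_l [(d_l + s) ln(i_l + s) - i_l] - sum_k A_k^2 / (2 var_k).

    detected   : detected projection vector (photo-counts)
    expected   : reference projection vector for the candidate parameters
    read_var   : CCD read noise variance, s
    params     : candidate parameter values A_k
    param_vars : prior variances of the parameters (diagonal covariance assumed)
    """
    data_term = sum((d + read_var) * math.log(i + read_var) - i
                    for d, i in zip(detected, expected))
    prior_term = sum(a * a / (2.0 * v) for a, v in zip(params, param_vars))
    return data_term - prior_term
```

For fixed data, each term of the data sum is largest when the expected projection element equals the detected one, so the reference projection that best matches the detected projection scores highest before the prior is applied.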

The likelihood expression in (5.8) has infinite dimensions due to the infinite set of input
parameters $\mathbf{a}$. Since the estimator is only concerned with providing estimates for $a_2$, $a_3$,
and $a_4$, these three dimensions are evaluated independently of all others. Furthermore,
to reduce the complexity of the maximization process, the parameters of interest will be
estimated independently of each other. Two characteristics of the problem allow maximizing
over each parameter independently: the decrease in turbulence power between tilt and
defocus, and the separability of Zernike effects in the chosen type of image projections. The
random CCD images $\mathbf{d}$ will always be a function of the infinite set $\mathbf{a}$; however, if a set
of expected image vectors can be precomputed with known amounts of a single parameter,
then the likelihood can be maximized one dimension at a time. Taking advantage of the
20:1 ratio of tilt power to defocus, the tilt parameters will be estimated first. Tilt reference
projections are formed by simulating the beacon image using an OTF containing only the
reference tilt aberrations and a long exposure OTF containing contributions from Zernikes
Z4 and higher. The Zernike contributions in the long exposure OTF are based on the
estimated atmospheric variables $r_0$, $L_0$ and $l_0$. Recall that the diffraction limited OTF
may be formed from the simulated PSF:


$$\mathcal{H} = \frac{\mathcal{DFT}\{\mathbf{I}\}}{\sum_{\mathbf{u}} \mathbf{I}[\mathbf{u}]}. \tag{5.12}$$


Note  that  the  object  is  assumed  to  be  a  point  source  and  therefore  the  expected  image  I 
is  a  diffraction  limited  PSF.  Any  aberrations  in  I  will  be  indicated  by  including  them  as 
arguments  of  I.  For  instance,  an  image,  which  is  otherwise  diffraction  limited,  that  contains 
a  contribution  of  x-tilt  is  denoted: 

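$$\mathbf{I}[A_2]. \tag{5.13}$$

The corresponding OTF is denoted:

$$\mathcal{H}[A_2] = \frac{\mathcal{DFT}\{\mathbf{I}[A_2]\}}{\sum_{\mathbf{u}} \mathbf{I}[\mathbf{u}; A_2]}. \tag{5.14}$$

Similarly, a long exposure OTF can be formed from the discrete Fourier transform of a long
exposure PSF. Consider the PSF formed from an ensemble average of images:

$${}_{L}\mathbf{I} = E_{\mathbf{a}}\{\mathbf{i}[\mathbf{a}]\}, \tag{5.15}$$

$${}_{L}\mathcal{H} = \frac{\mathcal{DFT}\{{}_{L}\mathbf{I}\}}{\sum_{\mathbf{u}} {}_{L}\mathbf{I}[\mathbf{u}]}, \tag{5.16}$$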


where the image $\mathbf{i}$ occurs here in lower case to emphasize that it is a random quantity.
Notice that the point spread function is given a preceding subscript L to distinguish
it from a diffraction limited point spread function. Goodman provides an expression for
the long exposure OTF in terms of the diffraction limited OTF and an OTF formed from
the phase structure function, $D_{P_\phi}$ [24]:

$${}_{L}\mathcal{H}[\mathbf{n}] = \mathcal{H}_{P_\phi}[\mathbf{n}]\,\mathcal{H}[\mathbf{n}], \tag{5.17}$$

$$\mathcal{H}_{P_\phi}[\mathbf{n}] = \exp\left(-\tfrac{1}{2} D_{P_\phi}[\mathbf{n}]\right). \tag{5.18}$$

The phase structure function is evaluated on a spatial grid. The OTF is evaluated on a
spatial frequency grid. A factor of $\lambda s_i$ is required to convert between the spatial frequency
dimension of $\mathcal{H}_{P_\phi}$ and the spatial dimension of $D_{P_\phi}$. The Nyquist relationship between
the aperture and image plane sampling grids accounts for the difference in spatial versus
spatial frequency dimensions in $D_{P_\phi}$ and $\mathcal{H}_{P_\phi}$ and, as such, accounts for the use of the same
indexing variable $\mathbf{n}$ in both the left and right hand sides of the expression above. $D_{P_\phi}$ is
therefore constructed from the ensemble averaged autocorrelation of the discretized pupil
phase over all possible pixel shifts of the discrete pupil function. If the pupil phase does not
contain contributions from specific atmospheric parameters, then the resulting OTF will be
referred to as the long exposure OTF with $Z_i$ removed. For example, define the long exposure
OTF for tilt removed turbulence as:

$${}_{L23}\mathcal{H}[\mathbf{n}] = \exp\left(-\tfrac{1}{2} D_{P_\phi}[\mathbf{n}; A_2 = 0, A_3 = 0]\right). \tag{5.19}$$

The  preceding  subscript  L  is  now  followed  by  the  numbers  2  and  3  to  indicate  that  the 
OTF  is  a  long  exposure  OTF  with  Zernikes  2  and  3  removed.  It  follows  then  that  the  tilt 
removed  point  spread  function  may  be  expressed  as: 


$${}_{L23}\mathbf{I} = \mathcal{DFT}\{{}_{L23}\mathcal{H} \cdot \mathcal{H}\}, \tag{5.20}$$

where the binary operator $\cdot$ indicates the Hadamard product, often referred to as an entrywise
or pointwise product:

$$(A \cdot B)_{ij} = A_{ij} B_{ij}. \tag{5.21}$$

Furthermore,  expected  images  with  arbitrary  tilt  and  long  exposure  variance  contributions 
from  all  high  order  Zernike  polynomials  are  formed  by: 


$${}_{L23}\mathbf{I}[A_2, A_3] = \mathcal{DFT}\{{}_{L23}\mathcal{H} \cdot \mathcal{H}[A_2, A_3]\}. \tag{5.22}$$

${}_{L23}\mathbf{I}$ is representative of the type of expected image used in the MAP estimator. Inserting
the appropriate long term expected image projections into the likelihood expression (5.8),
the tilt specific likelihood expressions are given by:


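$$L_{map2}(A_2) = \sum_{l=1}^{N_W} \left( \left[{}_{\{(1,N_W)\}}\mathbf{v}_l(\mathbf{D}_{1,0}) + \sigma_{ro}^2\right] \ln\left\{{}_{\{(1,N_W)\}}\mathbf{v}_l({}_{L23}\mathbf{I}_{1,0}[A_2]) + \sigma_{ro}^2\right\} - {}_{\{(1,N_W)\}}\mathbf{v}_l({}_{L23}\mathbf{I}_{1,0}[A_2]) \right) - \frac{A_2^2}{2\sigma_2^2}, \tag{5.23}$$

$$L_{map3}(A_3) = \sum_{l=1}^{N_W} \left( \left[{}_{\{(1,N_W)\}}\mathbf{v}_l(\mathbf{D}_{2,90}) + \sigma_{ro}^2\right] \ln\left\{{}_{\{(1,N_W)\}}\mathbf{v}_l({}_{L23}\mathbf{I}_{2,90}[A_3]) + \sigma_{ro}^2\right\} - {}_{\{(1,N_W)\}}\mathbf{v}_l({}_{L23}\mathbf{I}_{2,90}[A_3]) \right) - \frac{A_3^2}{2\sigma_3^2}. \tag{5.24}$$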

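Similar to the method for creating tilt reference projections, Z4 reference images are formed
from a combination of a known defocus OTF with a long exposure OTF containing appropriate
contributions from Z5 and higher:

$${}_{L234}\mathbf{I}[\mathbf{u}] = E_{\mathbf{a}}\left\{\mathbf{i}[\mathbf{u}; \mathbf{a} \mid A_2 = 0, A_3 = 0, A_4 = 0]\right\}, \tag{5.25}$$

$${}_{L234}\mathcal{H}[\mathbf{n}] = \exp\left(-\tfrac{1}{2} D_{P_\phi}[\mathbf{n}; A_2 = 0, A_3 = 0, A_4 = 0]\right), \tag{5.26}$$

$${}_{L234}\mathbf{I}[A_4] = \mathcal{DFT}\{{}_{L234}\mathcal{H} \cdot \mathcal{H}[A_4]\}. \tag{5.27}$$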

The set of Z4 expected image projections must be preregistered over an array of known tilt
values. The estimator will select the preregistered Z4 projection with the closest matching
pair of tilt values:

$$\left(\tilde{A}_2, \tilde{A}_3\right) = \Delta\tilde{A}_{2,3}\,\mathrm{round}\left(\frac{\left(\hat{A}_2, \hat{A}_3\right)}{\Delta\tilde{A}_{2,3}}\right), \tag{5.28}$$

where $(\hat{A}_2, \hat{A}_3)$ are formed during tilt estimation by choosing the parameters which
maximize $L_{map2}$ and $L_{map3}$ respectively. Note that the function round(·) is a call to the
Matlab® rounding function, which outputs the nearest integer to the argument.
Preregistered Z4 expected images are given by:

$${}_{L234}\mathbf{I}\left[\tilde{A}_2, \tilde{A}_3, A_4\right] = \mathcal{DFT}\left\{{}_{L234}\mathcal{H} \cdot \mathcal{H}\left[A_4, \tilde{A}_2, \tilde{A}_3\right]\right\}. \tag{5.29}$$
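
The selection of the nearest preregistered tilt value can be sketched as follows. This is an illustration under stated assumptions rather than the dissertation's implementation: the function name is hypothetical, values are expressed in units of the tilt standard deviation, and the clamp to the table bound is an added safeguard that the text does not describe.

```python
def snap_to_grid(estimate, step, bound):
    """Return the grid value nearest to `estimate` on a symmetric grid.

    The grid is {-bound, ..., -step, 0, step, ..., bound}. Python's round()
    resolves exact ties to the even integer, whereas Matlab's round() moves
    ties away from zero; the two agree everywhere else.
    """
    index = round(estimate / step)
    max_index = round(bound / step)
    index = max(-max_index, min(max_index, index))   # stay inside the table
    return index * step
```

With a preregistration step of 0.25 and a bound of 5, a tilt estimate of 0.30 maps to the table entry at 0.25, so the tilt mismatch seen by the defocus estimator is at most half a step.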


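The Z4 specific likelihood expression is given by:

$$L_{map4}(A_4) = \sum_{l=1}^{2N_W} \left( \left[{}_{\{(1,N_W)\}}\mathbf{v}_l(\mathbf{D}_{1,0}, \mathbf{D}_{2,90}) + \sigma_{ro}^2\right] \ln\left\{{}_{\{(1,N_W)\}}\mathbf{v}_l\left({}_{L234}\mathbf{I}_{1,0}[\tilde{A}_2, \tilde{A}_3, A_4], {}_{L234}\mathbf{I}_{2,90}[\tilde{A}_2, \tilde{A}_3, A_4]\right) + \sigma_{ro}^2\right\} - {}_{\{(1,N_W)\}}\mathbf{v}_l\left({}_{L234}\mathbf{I}_{1,0}[\tilde{A}_2, \tilde{A}_3, A_4], {}_{L234}\mathbf{I}_{2,90}[\tilde{A}_2, \tilde{A}_3, A_4]\right) \right) - \frac{A_4^2}{2\sigma_4^2}. \tag{5.30}$$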


5.4 Maximizing the Likelihood Expression

When  all  the  inputs  required  for  each  likelihood  expression  are  available,  the  estimator 
requires  a  fast  way  of  evaluating  and  maximizing  the  function.  Evaluating  the  likelihood 
is  made  more  efficient  by  precomputing  and  storing  banks  of  expected  image  projections. 
However,  locating  the  likelihood  maximum  can  be  computationally  expensive  and,  as  such, 
should be accomplished using as few evaluations of the likelihood as possible. The
maximization approach and the maximum number of "guesses" used in any gradient search
algorithm will be constrained by the operating bandwidth of the wavefront sensing system
and its ability to address stored arrays of reference vectors. There are many ways to
configure this portion of the estimation algorithm. Here I will offer one possible method of
maximization  and  the  rationale  behind  it. 

I will begin by describing the estimator lookup tables. The algorithm starts by
generating both tilt estimates independently using a bank of ${}_{S}\mathbf{v}({}_{L23}\mathbf{I})$ projections spanning
$\pm 4\sigma_{2,3}$ and separated in $A_{2,3}$ by $0.5\sigma_{2,3}$. The pair of tilt estimates are then passed to the
defocus estimator, which uses a bank of ${}_{S}\mathbf{v}({}_{L234}\mathbf{I})$ projections spanning $\pm 4\sigma_4$ and separated by
$0.8\sigma_4$. Recall that each defocus projection must be preregistered over an expected range
of tilt values. The performance of the defocus estimator is not significantly affected by
tilt estimates accurate within $\pm 0.2\sigma_{2,3}$; therefore the tilt preregistration grid is bounded
by $\pm 5\sigma_{2,3}$ with a step size of $0.25\sigma_{2,3}$. These lookup table bounds and step sizes can be
summarized as follows:

$$\Delta A_{2,3} = 0.5\sigma_{2,3}, \tag{5.31}$$

$$|A_{2,3}|_{max} = \pm 4\sigma_{2,3}, \tag{5.32}$$

$$\Delta A_4 = 0.8\sigma_4, \tag{5.33}$$

$$|A_4|_{max} = \pm 4\sigma_4, \tag{5.34}$$

$$\Delta\tilde{A}_{2,3} = 0.25\sigma_{2,3}, \tag{5.35}$$

$$|\tilde{A}_{2,3}|_{max} = \pm 5\sigma_{2,3}. \tag{5.36}$$
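
The table sizes implied by these bounds and step sizes can be tallied with a few lines of code. The helper name is hypothetical, values are in units of the relevant standard deviation, and the final count assumes that every defocus entry is stored for every pair of preregistered tilt values.

```python
def symmetric_grid(step, bound):
    """Return the grid {-bound, ..., 0, ..., bound} with the given step."""
    n = round(bound / step)
    return [k * step for k in range(-n, n + 1)]


tilt_grid = symmetric_grid(0.5, 4.0)        # tilt reference projections
defocus_grid = symmetric_grid(0.8, 4.0)     # defocus reference projections
prereg_grid = symmetric_grid(0.25, 5.0)     # tilt preregistration values

# Number of stored defocus reference entries if the full two-dimensional
# tilt preregistration grid is kept (an assumption of this sketch).
defocus_entries = len(defocus_grid) * len(prereg_grid) ** 2
```

Under these assumptions the tilt tables hold 17 entries each, while the preregistered defocus table is far larger, which is why the preregistration step size is a memory driver.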


It  is  important  to  note  that  given  these  lookup  table  step  sizes  and  bounds,  the  maximum 
number  of  likelihood  evaluations  per  parameter  estimate  is  10  for  A2  and  A3  and  8  for  A4. 
These  numbers  are  based  on  a  search  beginning  with  the  three  evaluation  points  about  zero 
and  proceeding  with  a  fixed  step  gradient  search. 
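
One way to picture the fixed step search is the sketch below, which starts from the three table entries about zero and walks uphill one entry at a time. It is a hedged illustration only: the function name and the caching of evaluations are assumptions, and the likelihood is passed in as an ordinary function of the table value.

```python
def fixed_step_search(likelihood, grid):
    """Walk uphill along a lookup grid, starting from the points about zero.

    Returns the index of the best grid entry found and the number of
    likelihood evaluations that were needed to find it.
    """
    center = len(grid) // 2
    cache = {}

    def evaluate(k):
        if k not in cache:                  # each entry is evaluated once
            cache[k] = likelihood(grid[k])
        return cache[k]

    best = max((center - 1, center, center + 1), key=evaluate)
    if best != center:
        step = 1 if best > center else -1
        while 0 <= best + step < len(grid) and evaluate(best + step) > evaluate(best):
            best += step
    return best, len(cache)
```

On the 17 entry tilt table, a likelihood that keeps increasing toward the table edge costs 10 evaluations, which agrees with the maximum quoted above for A2 and A3; a peak near zero costs far fewer.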


The parameter estimates are formed using a quadratic curve fit through the three
highest points among the available steps. Investigating the nature of the A2 and A3
likelihood expressions reveals that they are very well behaved for the point source case. In
fact, given small enough lookup step sizes, the likelihood expression will be nearly quadratic
through the three highest points. As an example, consider that the 17 evaluations of the
likelihood yield the 3 peak points circled in red in Figure 5.3. In general, the quadratic fit
maximum through 3 or more points is given by:

$$\mathbf{y} = \mathbf{X}\mathbf{c}, \tag{5.37}$$

$$\mathbf{c} = \left(\mathbf{X}^t\mathbf{X}\right)^{-1}\mathbf{X}^t\mathbf{y}, \tag{5.38}$$

$$x_{max} = -\frac{c_1}{2c_2}. \tag{5.39}$$

However, the maximum, assuming exactly 3 points and a fixed step size, $\Delta x$, simplifies to:

$$x_{max} = \frac{(y_1 - y_2)\,\Delta x}{y_1 + y_3 - 2y_2} + x_1 + \frac{\Delta x}{2}. \tag{5.40}$$
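
The three point form in (5.40) is simple enough to check numerically. The sketch below is an illustration with a hypothetical function name; the guard against non-negative curvature is an added assumption, since the formula only locates a maximum when the middle point lies above the chord through its neighbors.

```python
def quadratic_peak(x1, dx, y1, y2, y3):
    """Vertex of the parabola through (x1, y1), (x1 + dx, y2), (x1 + 2 dx, y3)."""
    curvature = y1 + y3 - 2.0 * y2
    if curvature >= 0.0:
        raise ValueError("the three points do not bracket a maximum")
    return (y1 - y2) * dx / curvature + x1 + dx / 2.0


def parabola(x):
    # Test function with a known peak at x = 0.3.
    return -(x - 0.3) ** 2


peak = quadratic_peak(-1.0, 1.0, parabola(-1.0), parabola(0.0), parabola(1.0))
```

For an exactly quadratic likelihood the recovered peak equals the true maximum regardless of step size; the error discussed in the text arises only from the departure of the real likelihood from a parabola.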


This quadratic curve fit is also used to estimate Zernike 4. Figure 5.4 demonstrates the
quadratic curve fit through an example Z4 likelihood. The quadratic fit requires only 3
points over the parameter range. Judiciously choosing 3 realizations, for instance $x \in
\{-3\sigma_x, 0\sigma_x, 3\sigma_x\}$, will produce a faster yet less accurate estimator. Increasing the distance
between sample points increases the error in the quadratic fit and increases susceptibility to
errors in the $r_0$ estimate. These types of trade-off considerations force the choice of lookup
table design parameters to be specific to each application.

Figure 5.3 Example of the evaluation points and the quadratic curve fit used to form each
tilt estimate.


5.5  Sensor  Design  Variables 

The previous paragraphs outlined the general curvature sensor design and suggested
some choices for design variables. Next I will discuss the key design variables and how
each variable affects sensor performance. Table 5.1 lists the key variables that affect the
curvature sensor's performance.

Figure 5.4 Example of the evaluation points and the quadratic curve fit used to form each
defocus estimate.

Choice of subaperture diameter will be application specific. In section 2.3, defining
the parameter space, I derived a very important fact concerning the Zernike modes present
in atmospheric turbulence: the expected power in each Zernike mode is based on the ratio of
the size of the aperture, $D_p$, to the characteristic turbulence parameter, $r_0$. The inner and
outer scale parameters, $l_0$ and $L_0$, also affect performance, but to a lesser degree. Choice of
$D_p$ should therefore be based on the range of atmospheric conditions in which the sensor
will operate nominally. The largest ratio $D_p/r_0$ in which the sensor operates will produce the
worst case performance. On the other hand, ratios of $D_p/r_0 < 1$ place the curvature sensor
in an operating environment where the significance of Z4 contributions in the wavefront
is minimal. This condition reduces the sensor's performance below that of a tilt only
sensor.

As  the  defocus  diversity  increases,  the  sensor’s  ability  to  estimate  defocus  increases. 
Unfortunately,  the  opposite  is  true  of  the  sensor’s  ability  to  estimate  tilt.  If  the  operating 
variables  are  well  known,  an  ideal  diversity  factor  can  be  selected  to  provide  the  proper 
trade-off  between  accurate  tilt  estimates  and  defocus  estimates. 
Variable                                              Description
$D_p$                                                 Aperture diameter
$\pm\delta a_4$                                       Defocus diversity
$\theta_i$                                            CCD rotation angles
${}_{S}\mathbf{v}(\cdot)$                             Image projection operation
$\Delta A_i$, $|A_i|_{max}$,
$\Delta\tilde{A}_{2,3}$, $|\tilde{A}_{2,3}|_{max}$    Step size and range of lookup tables

Table 5.1 Curvature sensor design parameters.


The CCD angle $\theta_i$ determines the relative angle between the AO reference frame v axis
and the image projection direction. It will be demonstrated using a performance bounding
measure and via simulated performance that a separation angle, $\theta_1 - \theta_2 = 90$ degrees, is
optimal for the Z2-4 sensor. Using the same measures, it will also be demonstrated that the
performance of the Z2-4 sensor is invariant to a change in $\theta_1$ provided that the separation
angle is 90 degrees. For this reason, the Z2-4 sensor is set up with $\theta_1 = 0°$ and $\theta_2 = 90°$.

The projection operation ${}_{S}\mathbf{v}(\cdot)$ in the Z2-4 estimator accounts for the size of the
windowing function and the number of vector projections. Consequently, ${}_{S}\mathbf{v}(\cdot)$ determines
the number of pixels read out of each CCD array. As more pixels are read out of the
array, there is an increase in information available to the estimator. However, there is also
a proportional increase in the amount of read out noise and computational requirements.
A design trade-off must be made between information, read out noise, and computational
complexity. The increased computational complexity comes from an increase in the number
of vector points included in the likelihood equations.

Ideally, the expected projection lookup tables would provide an entry for every possible
estimate. This is computationally prohibitive. Instead the estimator uses tables with
some finite step size and range. Decreasing the step size, $\Delta A_i$, and increasing the
range, $|A_i|_{max}$, of the vector lookup tables improves the sensor's performance at the cost of
decreased speed and increased memory requirements. The quadratic fit may include more
points or be dispensed with entirely for some other approach. Creativity in the design of
the maximization routine must balance the speed, accuracy and robustness of the sensor.

5.6  Summary 

This chapter outlined the design of a MAP estimator based curvature sensor to include
its general hardware requirements and the flow of the software algorithm. The sensor
is designed to estimate 3 parameters: x-tilt, y-tilt, and defocus from point source image
projections. Key hardware considerations include the use of a beamsplitter to share the
incoming optical signal equally between two programmable CCD arrays and applying a
defocus diversity in each optical path. The software algorithm forms a MAP likelihood
incorporating estimates of the atmospheric conditions ($r_0$, $L_0$, and $l_0$), average photon count,
and precomputed image projections. With a maximum of 28 likelihood evaluations, the
curvature sensor is capable of estimating Zernikes 2 through 4. The next chapter provides
performance bounding for the curvature sensor and demonstrates how the performance bound
can be used to select ideal design variable settings. Chapter 8 provides simulated
performance results. In simulated cases, the curvature sensor is capable of providing improved
performance over that of a centroiding tilt sensor and a projection based ML tilt sensor.
6. Wavefront Sensor Performance Bound

The  mean  squared  difference  between  the  compensated  wavefront  and  the  desired  wavefront 
is  a  common  performance  measure  for  a  wavefront  sensor.  This  chapter  provides  an  analysis 
of  the  wavefront  residual  mean  squared  error  based  on  the  maximum  a  posteriori  estimator 
described in Chapter 5. This type of performance measure is best suited for simulation
because  it  requires  a  direct  measure  of  the  field  input  to  the  optical  system  which  is  available 
only  in  simulation.  Given  a  known  input,  it  is  straightforward  to  calculate  the  error  in  the 
wavefront  sensor  response  and  provide  statistics  on  that  error,  particularly  the  bias  and  the 
variance.  For  deeper  insight  into  performance  limits,  estimation  theory  provides  methods 
for  bounding  the  error  variance.  The  following  sections  establish  an  expression  for  the 
residual  wavefront  mean  squared  error  (MSE)  and  the  Cramer  Rao  lower  bound  (CRLB) 
for  estimator  variance.  These  measures  will  be  used  to  compare  the  performance  of  the 
MAP  estimator  to  existing  estimators  under  various  operating  conditions.  The  CRLB  will 
also  be  useful  for  determining  the  ideal  design  choices  for  the  MAP  estimator  for  a  given 
operating  environment. 

6.1  Wavefront  Mean  Squared  Error  (MSE) 

The  wavefront  MSE  must  include  both  error  due  to  the  estimator’s  imperfect  response 
and  the  error  from  additional  parameters  which  are  not  estimated.  Recall  that  a  volume 
of  turbulent  atmospheric  effects  can  be  integrated  along  the  direction  of  the  optical  path 
to  form  a  thin  phase  screen.  The  resulting  phase  screen  can  be  modeled  by  an  infinite 
series  of  Zernike  polynomials  with  coefficients,  a.  The  infinite  set  of  Zernike  coefficients 
can  be  divided  into  a  finite  set  of  parameters  to  be  estimated,  denoted  by  S,  and  an  infinite 
number  of  higher  order  coefficients: 


$$\mathbf{a} = \text{the infinite set of Zernike coefficients}, \tag{6.1}$$

$$\mathbf{a}_S = \{a_i : i \in S\}, \text{ the estimated set of Zernike coefficients}, \tag{6.2}$$

$$\mathbf{a}_{\bar{S}} = \{a_i : i \notin S\}, \text{ Zernike coefficients unknown to the estimator}, \tag{6.3}$$

$$\mathbf{a} = \mathbf{a}_S \cup \mathbf{a}_{\bar{S}}. \tag{6.4}$$

While the wavefront sensor attempts to estimate a small set of parameters, $\mathbf{a}_S$, it will be
shown that the remaining parameters $\mathbf{a}_{\bar{S}}$ and the noise characteristics of the CCD increase
the estimator mean squared error. In the paragraphs to follow, an expression is derived for
the residual wavefront error based on the selected set of parameters, $\mathbf{a}_S$.

Begin the derivation by assuming that the ideal wavefront is a unit amplitude, constant
phase, plane wave. Additionally, assume that the optical system is only affected by the
piston removed wavefront phase, $P_\phi$, so the coefficient $a_1$ is ignored. Recalling the aperture
convention and the relationship between aperture phase and the coefficients $\mathbf{a}$, the general
form for the field in the aperture expressed in continuous polar coordinates is given by:

$$\mathcal{P}(\mathbf{r}; \mathbf{a}, R_P) = W_P(\mathbf{r}; R_P)\exp\{jP_\phi(\mathbf{r}; \mathbf{a}, R_P)\}, \tag{6.5}$$

$$\text{where } R_P = \text{aperture radius}, \tag{6.6}$$

$$\mathbf{r} = (r, \theta), \tag{6.7}$$

$$0 \le r < \infty, \tag{6.8}$$

$$0 \le \theta < 2\pi, \tag{6.9}$$

$$\text{and } W_P(\mathbf{r}; R_P) = \begin{cases} 1, & r \le R_P, \\ 0, & \text{otherwise.} \end{cases} \tag{6.10}$$


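Extracting the piston removed phase expression and expanding it as a series of Zernike
polynomials yields:

$$P_\phi(\mathbf{r}; \mathbf{a}, R_P) = W_P(\mathbf{r}; R_P)\sum_{i=2}^{\infty} a_i Z_i(\mathbf{r}; R_P), \tag{6.11}$$

$$\text{where } Z_i(\mathbf{r}; R_P) = Z_i\left(\frac{r}{R_P}, \theta\right). \tag{6.12}$$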


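Section 2.3 demonstrated that the coefficients, $a_i$, can be found by projecting each Zernike
onto the wavefront phase as follows:

$$a_i = \int d\boldsymbol{\rho}\, W_Z(\boldsymbol{\rho}) Z_i(\boldsymbol{\rho}) P_\phi(\mathbf{r}; \mathbf{a}, R_P), \tag{6.13}$$

where $\boldsymbol{\rho}$ represents the scaled polar coordinates:

$$\boldsymbol{\rho} = \left(\frac{r}{R_P}, \theta\right) \quad \text{and} \quad W_Z(\boldsymbol{\rho}) = \begin{cases} \frac{1}{\pi}, & \frac{r}{R_P} \le 1, \\ 0, & \text{otherwise.} \end{cases}$$

Using this convention, the mean squared error in the compensated wavefront is given by:

$$\langle P_\epsilon^2 \rangle = E_{\mathbf{a}}\left\{\int d\boldsymbol{\rho}\, W_Z(\boldsymbol{\rho})\left[P_\phi(\mathbf{r}; \mathbf{a}, R_P) - \sum_{i \in S}\hat{a}_i Z_i(\boldsymbol{\rho})\right]^2\right\}, \tag{6.14}$$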


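where the hatted coefficient variable, $\hat{a}_i$, denotes an estimate of the respective random
coefficient $a_i$. Expanding into a sum of parameters and separating the set of estimated
parameters from the higher order parameters gives:

$$\langle P_\epsilon^2 \rangle = E_{\mathbf{a}}\left\{\int d\boldsymbol{\rho}\, W_Z(\boldsymbol{\rho})\left[\sum_{i \in S} a_i Z_i(\boldsymbol{\rho}) - \sum_{i \in S}\hat{a}_i Z_i(\boldsymbol{\rho}) + \sum_{i \notin S} a_i Z_i(\boldsymbol{\rho})\right]^2\right\} \tag{6.15}$$

$$= E_{\mathbf{a}}\left\{\int d\boldsymbol{\rho}\, W_Z(\boldsymbol{\rho})\left[\sum_{i \in S}(a_i - \hat{a}_i) Z_i(\boldsymbol{\rho}) + \sum_{i \notin S} a_i Z_i(\boldsymbol{\rho})\right]^2\right\}. \tag{6.16}$$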
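Taking advantage of the fact that the Zernike basis functions are orthonormal, it is possible
to expand the square and collect the nonzero terms:

$$\langle P_\epsilon^2 \rangle = E_{\mathbf{a}}\left\{\sum_{i \in S}(a_i - \hat{a}_i)^2\int d\boldsymbol{\rho}\, W_Z(\boldsymbol{\rho}) Z_i^2(\boldsymbol{\rho}) + \sum_{i \notin S} a_i^2\int d\boldsymbol{\rho}\, W_Z(\boldsymbol{\rho}) Z_i^2(\boldsymbol{\rho})\right\}. \tag{6.17}$$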

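The integral factor $\int d\boldsymbol{\rho}\, W_Z(\boldsymbol{\rho}) Z_i^2(\boldsymbol{\rho}) = 1$ for all $i$:

$$\langle P_\epsilon^2 \rangle = E_{\mathbf{a}}\left\{\sum_{i \in S}(a_i - \hat{a}_i)^2 + \sum_{i \notin S} a_i^2\right\}. \tag{6.18}$$

MSE is in units of rad², with the caveat that this measure is intimately tied to the pupil
area. Exchange the order of summation and expectation:

$$\langle P_\epsilon^2 \rangle = \sum_{i \in S} E\left\{(a_i - \hat{a}_i)^2\right\} + \sum_{i \notin S} E\left\{a_i^2\right\}. \tag{6.19}$$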
This result shows that the overall wavefront MSE is composed of two summation terms.
The first sum represents the estimator MSE. Assuming that the estimator is unbiased, this
term represents the estimator variance. The second term represents the total variance of
all remaining parameters in the atmospheric model, which will be denoted $\langle P_{uncorr}^2 \rangle$:

$$\langle P_{uncorr}^2 \rangle = \sum_{i \notin S} E\left\{a_i^2\right\}. \tag{6.20}$$
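
The orthonormality step that collapses the integrals in this derivation can be checked numerically for the three modes estimated by the Z2-4 sensor. The sketch below assumes the Noll normalization of the tilt and defocus polynomials on the unit disk and a weight of $1/\pi$; the midpoint rule integration and the function names are choices made for this illustration.

```python
import math

# Noll-normalized Zernike polynomials on the unit disk (assumed convention).
ZERNIKES = {
    2: lambda r, t: 2.0 * r * math.cos(t),              # x-tilt
    3: lambda r, t: 2.0 * r * math.sin(t),              # y-tilt
    4: lambda r, t: math.sqrt(3.0) * (2.0 * r * r - 1.0),   # defocus
}


def inner_product(i, j, n_r=200, n_t=200):
    """Midpoint-rule estimate of the weighted integral of Z_i Z_j over the disk."""
    dr, dt = 1.0 / n_r, 2.0 * math.pi / n_t
    total = 0.0
    for a in range(n_r):
        r = (a + 0.5) * dr
        for b in range(n_t):
            t = (b + 0.5) * dt
            total += ZERNIKES[i](r, t) * ZERNIKES[j](r, t) * r   # r dr dt area element
    return total * dr * dt / math.pi                     # weight W_Z = 1/pi
```

Each mode integrates to approximately one against itself and approximately zero against the others, which is what allows the cross terms to drop out and leaves only sums of squared coefficient errors.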


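Recall the expression for the covariance of Zernike coefficients derived from the von Karman
atmospheric model in (2.138). If $i = i'$ and $m = m'$ the expression simplifies to the variance
of the $a_i$s:

$$E\left\{a_i^2\right\} = 0.4898 \cdot 2^{4/3}\pi\left(\frac{D_p}{r_0}\right)^{5/3}(n(i) + 1)(-1)^{n(i) - m(i)}\int d\kappa\, \frac{J_{n(i)+1}^2(\kappa)}{\kappa\left(\kappa^2 + R_p^2\kappa_0^2\right)^{11/6}}\exp\left(-\frac{\kappa^2}{R_p^2\kappa_m^2}\right). \tag{6.21}$$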


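Using this expression, it is possible to approximate the lower bound on MSE for the case of
perfect compensation of parameters in the set S by numerically evaluating the integral:

$$\langle P_{uncorr}^2 \rangle = \sum_{i \notin S} 0.4898 \cdot 2^{4/3}\pi\left(\frac{D_p}{r_0}\right)^{5/3}(n(i) + 1)(-1)^{n(i) - m(i)}\int d\kappa\, \frac{J_{n(i)+1}^2(\kappa)}{\kappa\left(\kappa^2 + R_p^2\kappa_0^2\right)^{11/6}}\exp\left(-\frac{\kappa^2}{R_p^2\kappa_m^2}\right). \tag{6.22}$$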


As a reminder, the functions n(i) and m(i) were provided in Table 2.3. This expression
relates residual MSE to the atmospheric parameters and the aperture size. Figure 6.1
contains a plot of the residual MSE as it relates to the ratio $D_p/r_0$ for several sets of
estimated parameters S. In this case, $D_p$ was fixed at 0.07 m while $r_0$ varied over the range
{0.02 m ... 0.2 m}. The plot demonstrates that for a given sensor design, residual MSE will
vary as atmospheric conditions change. The plot also reveals that the advantage of
estimating additional parameters decreases for higher Zernike modes. Unfortunately, the
wavefront sensor will not provide a perfect set of estimates. The vector based curvature
sensor must contend with compressed image information and CCD noise. Additionally, the
defocus diversity required for higher order modal compensation increases error variance in
the tilt estimates. All of these factors will cause an increase in the residual wavefront error
by the sum of the error variances in the estimated parameter set:

$$\langle P_\epsilon^2 \rangle = \sum_{i \in S} E\left\{(\hat{a}_i - a_i)^2\right\} + \langle P_{uncorr}^2 \rangle. \tag{6.23}$$

Defining how sensor design variables affect the variance term, and consequently how they
affect $\langle P_\epsilon^2 \rangle$, offers a tool for adjusting the sensor to provide minimum residual error under
a given set of atmospheric conditions. The Cramer Rao lower bound is one method for
characterizing each error variance term. The next section will derive the CRLB and relate
the CRLB to key sensor design variables.


6.2  The  Cramer  Rao  Lower  Bound 

The  following  section  attempts  to  bound  the  performance  of  the  vector  based  wavefront 
sensor  based  on  the  limits  set  by  the  Cramer  Rao  lower  bound.  The  estimator  must 
be  unbiased  in  order  to  apply  the  CRLB.  For  this  reason,  assume  that  the  estimator  is 
unbiased  or  that  the  bias  is  not  a  function  of  the  parameter  or  system  variables  and  can 
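be removed. Van Trees provides the expression for the CRLB for unbiased estimators of
random parameters [8]:

$$E\left\{\left(a_i - \hat{a}_i(\mathbf{R})\right)^2\right\} \ge J_T^{ii}, \tag{6.24}$$

where $J_T^{ii}$ is the $ii$th element of $\mathbf{J}_T^{-1}$. $\mathbf{J}_T$ is the K x K square matrix formed from
$\mathbf{J}_D$, commonly referred to as Fisher's information matrix, and $\mathbf{J}_P$, the matrix of a priori
information:

$$\mathbf{J}_T = \mathbf{J}_D + \mathbf{J}_P. \tag{6.25}$$

The $ij$th element of $\mathbf{J}_D$ is defined [8]:

$$J_{D_{ij}} = -E\left[\frac{\partial^2 \ln p_{\mathbf{r}|\mathbf{a}}(\mathbf{R}|\mathbf{A})}{\partial A_i\,\partial A_j}\right]. \tag{6.26}$$

The $ij$th element of the a priori information matrix is defined [8]:

$$J_{P_{ij}} = -E\left[\frac{\partial^2 \ln p_{\mathbf{a}}(\mathbf{A})}{\partial A_i\,\partial A_j}\right]. \tag{6.27}$$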
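
To make the role of the two information matrices concrete, the following sketch evaluates the bound for a two parameter problem. It is only an illustration: the function name is hypothetical, the matrices are supplied as nested lists, and for a zero mean Gaussian prior the a priori matrix is simply the inverse of the prior covariance.

```python
def crlb_two_parameters(j_data, j_prior):
    """Return the diagonal of (J_D + J_P)^-1 for a 2 x 2 problem.

    j_data  : Fisher information matrix J_D as [[a, b], [c, d]]
    j_prior : a priori information matrix J_P in the same layout
    """
    a = j_data[0][0] + j_prior[0][0]
    b = j_data[0][1] + j_prior[0][1]
    c = j_data[1][0] + j_prior[1][0]
    d = j_data[1][1] + j_prior[1][1]
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("total information matrix is not positive definite")
    # Inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    return d / det, a / det


bounds = crlb_two_parameters([[4.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
```

When the data carry no information, the bound falls back to the prior variance of each parameter, and any information contributed by the measurement can only tighten it.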


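Combining these results, it is easy to recognize that the form of $\mathbf{J}_T$ contains the same internal
expression as the previously defined MAP estimator (2.17). The MAP estimator was
given as:

$$\max_{\mathbf{A}}\left\{\ln p_{\mathbf{r}|\mathbf{a}}(\mathbf{R}|\mathbf{A}) + \ln p_{\mathbf{a}}(\mathbf{A})\right\}\Big|_{\mathbf{A} = \hat{\mathbf{a}}_{map}}, \tag{6.28}$$

while the form for $\mathbf{J}_T$ is:

$$J_{T_{ij}} = -E\left[\frac{\partial^2\left[\ln p_{\mathbf{r}|\mathbf{a}}(\mathbf{R}|\mathbf{A}) + \ln p_{\mathbf{a}}(\mathbf{A})\right]}{\partial A_i\,\partial A_j}\right]. \tag{6.29}$$

Aside from the vector versus single parameter notation, the function to be maximized for
$\hat{\mathbf{a}}_{map}$ is the same expression to be differentiated in $\mathbf{J}_T$. This expression is sometimes referred
to as the log likelihood expression. The CRLB is in essence a measure of the average
curvature or second derivative of the log likelihood expression. Due to its significance and
continued recurrence throughout the remainder of this dissertation, I will define the MAP
and ML log likelihood expressions:

$$L_{map}(\mathbf{A}) = \ln p_{\mathbf{r}|\mathbf{a}}(\mathbf{R}|\mathbf{A}) + \ln p_{\mathbf{a}}(\mathbf{A}), \tag{6.30}$$

$$L_{ml}(\mathbf{A}) = \ln p_{\mathbf{r}|\mathbf{a}}(\mathbf{R}|\mathbf{A}). \tag{6.31}$$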
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In  the  CCD  detector  projection  model,  the  observed  process  r  is  replaced  by  one  or  more 
compressed  arrays  of  pixels  svz  (du).  Thus  the  probability  density,  pr(R),  in  van  Trees’ 
notation  will  be  replaced  by,  psVi(d^)(sVZ  (Du)),  the  pdf  for  a  detected  image  projection. 
The  final  form  for  the  projection  based  MAP  estimator  was  derived  from  the  assumption 
that  the  distribution  of  du  is  a  combination  of  signal  dependent  Poisson  shot  noise  and 
Gaussian  read  noise.  The  combination  of  Poisson  shot  noise  and  Gaussian  read  noise  was 
approximated  by  a  biased  Poisson  process,  d  =Poisson{l  +  a20}  —  a20  123] .  The  prior 
density  pa(A)  is  jointly  Gaussian.  Recall  the  MAP  estimator  expression: 


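$$\max_{\mathbf{A}}\left\{\sum_{l=1}^{N_V N_W}\Big(\left[s_{v_l}(\mathbf{D}_u) + \sigma_{ro}^2\right]\ln\left\{s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2\right\} - s_{v_l}(\mathbf{I}_u[\mathbf{A}])\Big) - \frac{\mathbf{A}\Lambda_a^{-1}\mathbf{A}^T}{2}\right\}. \tag{6.32}$$

From the MAP estimator, both the MAP and ML log likelihood expressions can be extracted for use in the CRLB calculations to follow:

$$L_{map}(\mathbf{A}) = \sum_{l=1}^{N_V N_W}\Big(\left[s_{v_l}(\mathbf{D}_u) + \sigma_{ro}^2\right]\ln\left\{s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2\right\} - s_{v_l}(\mathbf{I}_u[\mathbf{A}])\Big) - \frac{\mathbf{A}\Lambda_a^{-1}\mathbf{A}^T}{2}, \tag{6.33}$$

$$L_{ml}(\mathbf{A}) = \sum_{l=1}^{N_V N_W}\Big(\left[s_{v_l}(\mathbf{D}_u) + \sigma_{ro}^2\right]\ln\left\{s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2\right\} - s_{v_l}(\mathbf{I}_u[\mathbf{A}])\Big). \tag{6.34}$$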
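The two log likelihood expressions in (6.33) and (6.34) can be evaluated directly once the detected and expected projections are available. The following is a sketch only: the function names and the sample projection vector are hypothetical, and the projections are assumed to be supplied as flat arrays of $N_V N_W$ samples.

```python
import numpy as np

def log_likelihood_ml(proj_detected, proj_expected, sigma_ro):
    """ML log likelihood of (6.34), summed over all projection samples."""
    d = np.asarray(proj_detected, dtype=float)
    s = np.asarray(proj_expected, dtype=float)
    var_ro = sigma_ro ** 2
    return float(np.sum((d + var_ro) * np.log(s + var_ro) - s))

def log_likelihood_map(proj_detected, proj_expected, sigma_ro, a, prior_cov):
    """MAP log likelihood of (6.33): ML term minus the Gaussian prior penalty."""
    a = np.asarray(a, dtype=float)
    penalty = 0.5 * a @ np.linalg.solve(prior_cov, a)   # A Lambda_a^{-1} A^T / 2
    return log_likelihood_ml(proj_detected, proj_expected, sigma_ro) - float(penalty)

# Hypothetical projection vector; sigma_ro = 2.13 counts matches the figure captions.
detected = np.array([12.0, 40.0, 95.0, 38.0, 10.0])
best = log_likelihood_ml(detected, detected, 2.13)
worse = log_likelihood_ml(detected, 1.2 * detected, 2.13)
print(best > worse)
```

Each summand $(d + \sigma_{ro}^2)\ln(s + \sigma_{ro}^2) - s$ is concave in $s$ with its maximum at $s = d$, which is why the expected projection that matches the detection scores highest here.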


The infinite parameter vector, $\mathbf{A}$, must be reduced to a finite parameter set. When evaluating the lower bound, the intent is to model the operating environment as accurately as possible, which requires including as many Zernike modes as possible. Unfortunately, as the number of parameters, $N$, increases, the complexity of the CRLB calculation increases as $N^2$. As such, the set of parameters must be truncated at a point where error and computation time are both acceptable. The remainder of the expressions in this section use this truncated parameter vector in place of the infinite vector in all instances of $L_{map}$ and $L_{ml}$ above. Substituting (6.33) into the expression for $\mathbf{J}_T$ in (6.29) yields:

$$J_{T_{ij}} = -E\left\{\frac{\partial^2}{\partial A_i\,\partial A_j}\left[\sum_{l=1}^{N_V N_W}\Big(\left[s_{v_l}(\mathbf{D}_u) + \sigma_{ro}^2\right]\ln\left\{s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2\right\} - s_{v_l}(\mathbf{I}_u[\mathbf{A}])\Big) - \frac{\mathbf{A}\Lambda_a^{-1}\mathbf{A}^T}{2}\right]\right\}. \tag{6.35}$$

Evaluating the second partial derivative of the prior gives:

\begin{align}
J_{T_{ij}} &= -E\left\{\frac{\partial^2}{\partial A_i\,\partial A_j}\sum_{l=1}^{N_V N_W}\Big(\left[s_{v_l}(\mathbf{D}_u) + \sigma_{ro}^2\right]\ln\left\{s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2\right\} - s_{v_l}(\mathbf{I}_u[\mathbf{A}])\Big)\right\} + \left[\Lambda_a^{-1}\right]_{ij} \tag{6.36}\\
&= J_{D_{ij}} + \left[\Lambda_a^{-1}\right]_{ij}. \tag{6.37}
\end{align}


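In (6.36), the Fisher information matrix, $\mathbf{J}_D$, contains an expected value operator and partial derivatives. A simpler expression for $\mathbf{J}_D$ without partials or expectation integrals is desired. Begin by evaluating the two partial derivatives of the conditional density portion of the log likelihood function, $L_{ml}(\mathbf{A})$ in (6.34). Evaluating the first partial derivative yields:

\begin{align}
\frac{\partial L_{ml}(\mathbf{A})}{\partial A_i} &= \sum_{l=1}^{N_V N_W}\left[\frac{s_{v_l}(\mathbf{D}_u) + \sigma_{ro}^2}{s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2}\,\frac{\partial}{\partial A_i}\left\{s_{v_l}(\mathbf{I}_u[\mathbf{A}])\right\} - \frac{\partial}{\partial A_i}\left\{s_{v_l}(\mathbf{I}_u[\mathbf{A}])\right\}\right] \tag{6.38}\\
&= \sum_{l=1}^{N_V N_W}\frac{\partial}{\partial A_i}\left\{s_{v_l}(\mathbf{I}_u[\mathbf{A}])\right\}\left[\frac{s_{v_l}(\mathbf{D}_u) + \sigma_{ro}^2}{s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2} - 1\right]. \tag{6.39}
\end{align}

Continuing to evaluate the second partial derivative gives:

\begin{align}
\frac{\partial^2 L_{ml}(\mathbf{A})}{\partial A_i\,\partial A_j} = \sum_{l=1}^{N_V N_W}\Bigg\{&\left[\frac{s_{v_l}(\mathbf{D}_u) + \sigma_{ro}^2}{s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2} - 1\right]\frac{\partial^2\left\{s_{v_l}(\mathbf{I}_u[\mathbf{A}])\right\}}{\partial A_i\,\partial A_j} \tag{6.40}\\
&- \frac{s_{v_l}(\mathbf{D}_u) + \sigma_{ro}^2}{\left[s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2\right]^2}\,\frac{\partial}{\partial A_i}\left\{s_{v_l}(\mathbf{I}_u[\mathbf{A}])\right\}\frac{\partial}{\partial A_j}\left\{s_{v_l}(\mathbf{I}_u[\mathbf{A}])\right\}\Bigg\}. \tag{6.41}
\end{align}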
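The factored first derivative in (6.39) can be checked numerically against a finite difference of $L_{ml}$. The sketch below uses a hypothetical toy model for the expected projections (an exponential of a linear map of the parameters), chosen only because it is smooth and positive; it is not the imaging model of this dissertation.

```python
import numpy as np

rng = np.random.default_rng(0)
VAR_RO = 2.13 ** 2                       # read noise variance in counts^2
G = rng.normal(scale=0.3, size=(6, 3))   # hypothetical: 6 projection samples, 3 parameters
BASE = np.full(6, 50.0)                  # hypothetical unaberrated projection values

def proj_expected(A):
    """Toy stand-in for s_vl(I_u[A]); always positive and smooth in A."""
    return BASE * np.exp(G @ A)

def proj_jacobian(A):
    """d s_vl / d A_i for the toy model, shape (samples, parameters)."""
    return proj_expected(A)[:, None] * G

def L_ml(A, d):
    s = proj_expected(A)
    return np.sum((d + VAR_RO) * np.log(s + VAR_RO) - s)

def grad_L_ml(A, d):
    """First partial derivatives in the factored form of (6.39)."""
    s = proj_expected(A)
    weight = (d + VAR_RO) / (s + VAR_RO) - 1.0
    return proj_jacobian(A).T @ weight

A0 = np.array([0.1, -0.2, 0.05])
d = rng.poisson(proj_expected(A0)).astype(float)   # one noisy detection
h = 1e-6
numeric = np.array([(L_ml(A0 + h * e, d) - L_ml(A0 - h * e, d)) / (2 * h)
                    for e in np.eye(3)])
analytic = grad_L_ml(A0, d)
print(np.max(np.abs(numeric - analytic)))
```

The central difference and the analytic gradient should agree to within finite-difference error for any smooth projection model substituted for the toy one.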


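The log likelihood contains a generic projection operator, $s_{v_l}(\cdot)$. The projection operator is presented in generic form to indicate that this derivation holds for all possible projection operations. The linear nature of the projection operator allows it to commute with the differentiation operator:

$$\frac{\partial^2 L_{ml}(\mathbf{A})}{\partial A_i\,\partial A_j} = \sum_{l=1}^{N_V N_W}\left\{\left[\frac{s_{v_l}(\mathbf{D}_u) + \sigma_{ro}^2}{s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2} - 1\right]s_{v_l}\!\left(\frac{\partial^2\mathbf{I}_u[\mathbf{A}]}{\partial A_i\,\partial A_j}\right) - \frac{s_{v_l}(\mathbf{D}_u) + \sigma_{ro}^2}{\left[s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2\right]^2}\,s_{v_l}\!\left(\frac{\partial}{\partial A_i}\mathbf{I}_u[\mathbf{A}]\right)s_{v_l}\!\left(\frac{\partial}{\partial A_j}\mathbf{I}_u[\mathbf{A}]\right)\right\}. \tag{6.42}$$

Substituting these results into the expression for $J_{D_{ij}}$ in (6.26):

$$J_{D_{ij}} = -E_{d,a}\left\{\sum_{l=1}^{N_V N_W}\left(\left[\frac{s_{v_l}(\mathbf{D}_u) + \sigma_{ro}^2}{s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2} - 1\right]s_{v_l}\!\left(\frac{\partial^2\mathbf{I}_u[\mathbf{A}]}{\partial A_i\,\partial A_j}\right) - \frac{s_{v_l}(\mathbf{D}_u) + \sigma_{ro}^2}{\left[s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2\right]^2}\,s_{v_l}\!\left(\frac{\partial}{\partial A_i}\mathbf{I}_u[\mathbf{A}]\right)s_{v_l}\!\left(\frac{\partial}{\partial A_j}\mathbf{I}_u[\mathbf{A}]\right)\right)\right\}.$$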
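The detected images, $\mathbf{D}_u$, are the only random quantities with Poisson noise within the derivative expression. Simplify the expression by evaluating the Poisson part of the expectation:

\begin{align}
J_{D_{ij}} &= E_a\Bigg\{\sum_{l=1}^{N_V N_W}\Bigg(\frac{E_d\{s_{v_l}(\mathbf{D}_u)|\mathbf{A}\} + \sigma_{ro}^2}{\left[s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2\right]^2}\,s_{v_l}\!\left(\frac{\partial}{\partial A_i}\mathbf{I}_u[\mathbf{A}]\right)s_{v_l}\!\left(\frac{\partial}{\partial A_j}\mathbf{I}_u[\mathbf{A}]\right) + \left[1 - \frac{E_d\{s_{v_l}(\mathbf{D}_u)|\mathbf{A}\} + \sigma_{ro}^2}{s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2}\right]s_{v_l}\!\left(\frac{\partial^2\mathbf{I}_u[\mathbf{A}]}{\partial A_i\,\partial A_j}\right)\Bigg)\Bigg\} \tag{6.43}\\
&= E_a\Bigg\{\sum_{l=1}^{N_V N_W}\Bigg(\frac{s_{v_l}(E_d\{\mathbf{D}_u|\mathbf{A}\}) + \sigma_{ro}^2}{\left[s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2\right]^2}\,s_{v_l}\!\left(\frac{\partial}{\partial A_i}\mathbf{I}_u[\mathbf{A}]\right)s_{v_l}\!\left(\frac{\partial}{\partial A_j}\mathbf{I}_u[\mathbf{A}]\right) + \left[1 - \frac{s_{v_l}(E_d\{\mathbf{D}_u|\mathbf{A}\}) + \sigma_{ro}^2}{s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2}\right]s_{v_l}\!\left(\frac{\partial^2\mathbf{I}_u[\mathbf{A}]}{\partial A_i\,\partial A_j}\right)\Bigg)\Bigg\}, \tag{6.44}
\end{align}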


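where $E_d\{\cdot\}$ represents the expectation taken over the CCD noise and $E_a\{\cdot\}$ represents the expectation over the set of random parameters $\mathbf{a}$. To move the expectation operation inside $s_{v_l}(\cdot)$, I have once again taken advantage of the linearity of the projection operator. Evaluating the $E_d\{\cdot\}$ operation, the expectation on $\mathbf{D}$ removes the CCD noise, resulting in the expected image $\mathbf{I}$:

\begin{align}
J_{D_{ij}} &= E_a\Bigg\{\sum_{l=1}^{N_V N_W}\Bigg(\frac{s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2}{\left[s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2\right]^2}\,s_{v_l}\!\left(\frac{\partial}{\partial A_i}\mathbf{I}_u[\mathbf{A}]\right)s_{v_l}\!\left(\frac{\partial}{\partial A_j}\mathbf{I}_u[\mathbf{A}]\right) + \left[1 - \frac{s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2}{s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2}\right]s_{v_l}\!\left(\frac{\partial^2\mathbf{I}_u[\mathbf{A}]}{\partial A_i\,\partial A_j}\right)\Bigg)\Bigg\} \tag{6.45}\\
&= E_a\left\{\sum_{l=1}^{N_V N_W}\frac{s_{v_l}\!\left(\frac{\partial}{\partial A_i}\mathbf{I}_u[\mathbf{A}]\right)s_{v_l}\!\left(\frac{\partial}{\partial A_j}\mathbf{I}_u[\mathbf{A}]\right)}{s_{v_l}(\mathbf{I}_u[\mathbf{A}]) + \sigma_{ro}^2}\right\}. \tag{6.46}
\end{align}

The second derivative term drops out because its bracketed coefficient is identically zero once the detected image is replaced by its expected value.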
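For a single parameter realization, the summand of (6.46) reduces to a weighted product of projection derivatives. The sketch below assumes those derivatives are already available as a matrix with one row per projection sample and one column per parameter; the function name and the numbers are hypothetical.

```python
import numpy as np

def fisher_from_projections(proj_expected, proj_jacobian, sigma_ro):
    """One realization of J_D using the summand of (6.46)."""
    s = np.asarray(proj_expected, dtype=float)      # s_vl(I_u[A]), shape (L,)
    jac = np.asarray(proj_jacobian, dtype=float)    # s_vl(dI_u/dA_i), shape (L, K)
    weight = 1.0 / (s + sigma_ro ** 2)
    return jac.T @ (weight[:, None] * jac)

# Hypothetical values: two projection samples, two parameters, no read noise.
J = fisher_from_projections([10.0, 20.0], [[1.0, 0.0], [1.0, 2.0]], 0.0)
# Same projections with the 2.13 count read noise quoted in the figure captions.
J_noisy = fisher_from_projections([10.0, 20.0], [[1.0, 0.0], [1.0, 2.0]], 2.13)
print(J)
print(J_noisy)
```

Because the read noise variance only appears in the denominator, adding read noise can only reduce the information in each projection sample, which is visible in the smaller diagonal of `J_noisy`.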


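Evaluating the CRLB will require a closed form expression for the derivative of the expected image, $I[\mathbf{A}]$. The derivation here follows the results provided by Fienup et al. [53]. Assume the case of a point source object. Compact the pupil notation by removing the explicit dependency on $A_P$ and $R_P$: assume that the amplitude function for the field in the aperture plane is a constant value of one and $R_P = 1$. The resulting expression for the pupil is:

$$\mathcal{P}[\mathbf{n};\mathbf{A}] = W_P[\mathbf{n}]\exp\left\{jP_\phi[\mathbf{n};\mathbf{A}]\right\}. \tag{6.47}$$

Using the linear model of the optical system, the expected image for the case of a point source object is given by:

\begin{align}
I[\mathbf{u};\mathbf{A}] &= \frac{K\,\mathcal{I}[\mathbf{u};\mathbf{A}]\,\mathcal{I}^*[\mathbf{u};\mathbf{A}]}{\sum_{\mathbf{u}}\mathcal{I}[\mathbf{u};\mathbf{A}]\,\mathcal{I}^*[\mathbf{u};\mathbf{A}]} \tag{6.48}\\
&= K'\,\mathcal{I}[\mathbf{u};\mathbf{A}]\,\mathcal{I}^*[\mathbf{u};\mathbf{A}]. \tag{6.49}
\end{align}

Normalization by the average photon count $K$ models the SNR in each image plane. The constant $K'$ is introduced to compact the notation. An analytical expression for the image derivative requires differentiating the image with respect to the Zernike parameter. Ignoring the constant, which will be replaced by SNR scaling, and differentiating yields:

\begin{align}
\frac{\partial}{\partial A_i}I[\mathbf{u};\mathbf{A}] &= \frac{\partial}{\partial A_i}\left\{\mathcal{I}[\mathbf{u};\mathbf{A}]\,\mathcal{I}^*[\mathbf{u};\mathbf{A}]\right\} \tag{6.50}\\
&= \frac{\partial}{\partial A_i}\left\{\mathcal{I}[\mathbf{u};\mathbf{A}]\right\}\mathcal{I}^*[\mathbf{u};\mathbf{A}] + \frac{\partial}{\partial A_i}\left\{\mathcal{I}^*[\mathbf{u};\mathbf{A}]\right\}\mathcal{I}[\mathbf{u};\mathbf{A}]. \tag{6.51}
\end{align}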

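The image field, $\mathcal{I}$, is calculated via the discrete Fraunhofer diffraction integral of the pupil function:

\begin{align}
\mathcal{I}[\mathbf{u};\mathbf{A}] &= \frac{\Delta x\,\Delta y}{\lambda s_i}\,\mathrm{DFT}\left\{\mathcal{P}[\mathbf{n};\mathbf{A}]\right\} \tag{6.52}\\
&= \frac{\Delta x\,\Delta y}{\lambda s_i}\sum_{\mathbf{n}}\mathcal{P}[\mathbf{n};\mathbf{A}]\exp\left\{-j\frac{2\pi}{N}[\mathbf{n}\cdot\mathbf{u}]\right\} \tag{6.53}\\
&= \frac{\Delta x\,\Delta y}{\lambda s_i}\sum_{\mathbf{n}}W_P[\mathbf{n}]\exp\left\{jP_\phi[\mathbf{n};\mathbf{A}]\right\}\exp\left\{-j\frac{2\pi}{N}[\mathbf{n}\cdot\mathbf{u}]\right\}. \tag{6.54}
\end{align}

Exchanging the order of summation and differentiation, it is easy to show that the derivative of the conjugate field equals the conjugate of the derivative of the field:

\begin{align}
\frac{\partial}{\partial A_i}\mathcal{I}[\mathbf{u};\mathbf{A}] &= \sum_{\mathbf{n}}W_P[\mathbf{n}]\frac{\partial}{\partial A_i}\left[\exp\left\{jP_\phi[\mathbf{n};\mathbf{A}]\right\}\right]\exp\left\{-j\frac{2\pi}{N}[\mathbf{n}\cdot\mathbf{u}]\right\}, \tag{6.55}\\
\frac{\partial}{\partial A_i}\mathcal{I}^*[\mathbf{u};\mathbf{A}] &= \sum_{\mathbf{n}}W_P[\mathbf{n}]\frac{\partial}{\partial A_i}\left[\exp\left\{-jP_\phi[\mathbf{n};\mathbf{A}]\right\}\right]\exp\left\{j\frac{2\pi}{N}[\mathbf{n}\cdot\mathbf{u}]\right\} \tag{6.56}\\
&= \left(\frac{\partial}{\partial A_i}\mathcal{I}[\mathbf{u};\mathbf{A}]\right)^*. \tag{6.57}
\end{align}

Substituting (6.57) and (6.51) into (6.49):

$$\frac{\partial}{\partial A_i}I[\mathbf{u};\mathbf{A}] = K'\left[\frac{\partial}{\partial A_i}\left[\mathcal{I}[\mathbf{u};\mathbf{A}]\right]\mathcal{I}^*[\mathbf{u};\mathbf{A}] + \left(\frac{\partial}{\partial A_i}\mathcal{I}[\mathbf{u};\mathbf{A}]\right)^*\mathcal{I}[\mathbf{u};\mathbf{A}]\right]. \tag{6.58}$$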


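To further define the image derivative, evaluate the derivative of the exponential phase term in (6.55):

$$\frac{\partial}{\partial A_i}\left[\exp\left\{jP_\phi[\mathbf{n};\mathbf{A}]\right\}\right] = j\exp\left\{jP_\phi[\mathbf{n};\mathbf{A}]\right\}\frac{\partial}{\partial A_i}P_\phi[\mathbf{n};\mathbf{A}]. \tag{6.59}$$

The phase function, $P_\phi$, and its derivative are defined for each of the Zernike modes:

\begin{align}
P_\phi[\mathbf{n};\mathbf{A}] &= A_i Z_i[\mathbf{n}] + \sum_{l \in S,\; l \neq i}A_l Z_l[\mathbf{n}], \tag{6.60}\\
\frac{\partial}{\partial A_i}P_\phi[\mathbf{n};\mathbf{A}] &= Z_i[\mathbf{n}]. \tag{6.61}
\end{align}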

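Substituting (6.61) and (6.59) into (6.55) gives:

\begin{align}
\frac{\partial}{\partial A_i}\mathcal{I}[\mathbf{u};\mathbf{A}] &= j\sum_{\mathbf{n}}W_P[\mathbf{n}]\,Z_i[\mathbf{n}]\exp\left\{jP_\phi[\mathbf{n};\mathbf{A}]\right\}\exp\left\{-j\frac{2\pi}{N}[\mathbf{n}\cdot\mathbf{u}]\right\} \tag{6.62}\\
&= j\,\mathrm{DFT}\left\{W_P[\mathbf{n}]\,Z_i[\mathbf{n}]\exp\left\{jP_\phi[\mathbf{n};\mathbf{A}]\right\}\right\}. \tag{6.63}
\end{align}

Substituting (6.63) into (6.58) produces the derivative of the expected image:

\begin{align}
\frac{\partial}{\partial A_i}I[\mathbf{u};\mathbf{A}] &= jK'\,\mathrm{DFT}\left\{W_P[\mathbf{n}]\,Z_i[\mathbf{n}]\exp\left\{jP_\phi[\mathbf{n};\mathbf{A}]\right\}\right\}\mathcal{I}^*[\mathbf{u};\mathbf{A}] + K'\left(j\,\mathrm{DFT}\left\{W_P[\mathbf{n}]\,Z_i[\mathbf{n}]\exp\left\{jP_\phi[\mathbf{n};\mathbf{A}]\right\}\right\}\right)^*\mathcal{I}[\mathbf{u};\mathbf{A}] \tag{6.64}\\
&= -2K'\,\mathrm{Im}\left\{\mathrm{DFT}\left\{W_P[\mathbf{n}]\,Z_i[\mathbf{n}]\exp\left\{jP_\phi[\mathbf{n};\mathbf{A}]\right\}\right\}\mathcal{I}^*[\mathbf{u};\mathbf{A}]\right\}. \tag{6.65}
\end{align}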
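The closed form in (6.65) can be verified against a finite difference of the image itself. The sketch below makes several simplifying assumptions that are not taken from the dissertation: the scaling constant is set to one, the pupil fills the whole sample grid with no zero padding, and three unnormalized polynomial modes stand in for $Z_2$ through $Z_4$.

```python
import numpy as np

N = 32
y, x = np.mgrid[-1:1:N * 1j, -1:1:N * 1j]
r2 = x ** 2 + y ** 2
W = (r2 <= 1.0).astype(float)                 # aperture window W_P[n]
modes = np.stack([x, y, 2.0 * r2 - 1.0])      # stand-ins for Z_2, Z_3, Z_4 (not normalized)

def field(A):
    """Image field as the DFT of the pupil, constant prefactor dropped."""
    phase = np.tensordot(A, modes, axes=1)    # P_phi[n; A] = sum_l A_l Z_l[n]
    return np.fft.fft2(W * np.exp(1j * phase))

def image(A):
    """Expected image with the scaling constant set to one."""
    f = field(A)
    return (f * f.conj()).real

def image_derivative(A, i):
    """Derivative of the image with respect to A_i, following (6.65)."""
    phase = np.tensordot(A, modes, axes=1)
    X = np.fft.fft2(W * modes[i] * np.exp(1j * phase))
    return -2.0 * np.imag(X * field(A).conj())

A0 = np.array([0.3, -0.2, 0.5])
h = 1e-5
max_rel_err = 0.0
for i in range(3):
    e = np.zeros(3)
    e[i] = h
    numeric = (image(A0 + e) - image(A0 - e)) / (2 * h)   # central difference
    analytic = image_derivative(A0, i)
    rel_err = np.max(np.abs(numeric - analytic)) / np.max(np.abs(analytic))
    max_rel_err = max(max_rel_err, rel_err)
print(max_rel_err)
```

The analytic form needs one extra DFT per mode, whereas the finite difference needs two full image evaluations per mode and a step size choice, which is one practical reason to prefer the closed form inside a Monte Carlo loop.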


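Using this closed form expression for the image derivative, the Fisher information matrix entry in (6.45) becomes:

$$J_{D_{ij}} = E_a\left\{\sum_{l=1}^{N_V N_W}\frac{4K'^2}{s_{v_l}(I[\mathbf{u};\mathbf{A}]) + \sigma_{ro}^2}\times s_{v_l}\!\left(\mathrm{Im}\left\{\mathrm{DFT}\left\{W_P[\mathbf{n}]\,Z_i[\mathbf{n}]\exp\left\{jP_\phi[\mathbf{n};\mathbf{A}]\right\}\right\}\mathcal{I}^*[\mathbf{u};\mathbf{A}]\right\}\right)\times s_{v_l}\!\left(\mathrm{Im}\left\{\mathrm{DFT}\left\{W_P[\mathbf{n}]\,Z_j[\mathbf{n}]\exp\left\{jP_\phi[\mathbf{n};\mathbf{A}]\right\}\right\}\mathcal{I}^*[\mathbf{u};\mathbf{A}]\right\}\right)\right\}. \tag{6.66}$$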


Combining this result with the a priori matrix, $\mathbf{J}_P$, yields a straightforward method for evaluating the CRLB. The only nontrivial calculation is the expectation over the parameter set $\mathbf{a}$. This integral cannot be evaluated analytically. Instead it may be approximated using a Monte Carlo simulation. The Monte Carlo simulation requires a sequence of random atmospheric realizations formed using an appropriate distribution of the parameters $\mathbf{a}$. Using random atmospheric realizations, an ensemble average of $\mathbf{J}_{D_s}$ values can be computed. The average value of $\mathbf{J}_D$ is then used to form $\mathbf{J}_T$:

$$\mathbf{J}_T = \frac{\sum_s \mathbf{J}_{D_s}}{N_s} + \Lambda_a^{-1}, \tag{6.67}$$

where $\mathbf{J}_{D_s}$ represents a single Monte Carlo realization of the matrix $\mathbf{J}_D$, and $N_s$ represents the number of Monte Carlo trials. Thus, the lower bound on residual mean squared error is given by:

$$\langle\epsilon^2\rangle \geq \mathrm{Trace}_S\left\{\mathbf{J}_T^{-1}\right\} + \sum_{i \notin S}E\left\{a_i^2\right\}, \tag{6.68}$$

where the trace of the matrix is taken over only the parameters included in the set $S$.

6.3 Adjusting Design Variables to Minimize CRLB


There are many variables contributing to estimator performance. Some are environmental variables over which the system designer has little choice, but others offer flexibility to the designer. Flexible wavefront sensor design variables should be adjusted to some ideal setting for a given set of environment variables. The ideal sensor configuration can be defined as that configuration of flexible design variables which minimizes the CRLB for some set of environment variables. Recall the list of key environment and design variables from Section 5.5. The CRLB is calculated by Monte Carlo simulation using atmospheric phase screens that contain a limited set of Zernike polynomials. Due to the limited set of Zernike polynomials present in the simulation, there may be a significant difference between actual performance and the derived lower bound. With this in mind, the designer may view the minimum CRLB settings as a starting point when adjusting the design to optimize performance. In the next two subsections, I will demonstrate using the CRLB to make ideal design choices for a whole plane projection sensor and a half plane projection sensor. The whole plane projection sensor, the $Z_{2-4}$ sensor, was discussed in Chapter 5. The half plane projection includes two vectors from each CCD and is the image information used by the $Z_{2-10}$ sensor to be detailed in Chapter 9.
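The Monte Carlo evaluation of the bound in (6.67) and (6.68), which produces the CRLB values used throughout this section, can be sketched as follows. This is an illustration under stated assumptions rather than the simulation code: `toy_fisher` is a hypothetical stand-in for the per-realization Fisher matrix, the prior is zero-mean Gaussian, and the information matrix is assumed to cover exactly the estimated set $S$, so that the trace runs over the whole matrix.

```python
import numpy as np

def monte_carlo_bound(fisher_fn, prior_cov, unestimated_var, n_trials, rng):
    """Approximate (6.67) and (6.68) by averaging J_D over random parameter draws."""
    K = prior_cov.shape[0]
    J_sum = np.zeros((K, K))
    for _ in range(n_trials):
        a = rng.multivariate_normal(np.zeros(K), prior_cov)  # one atmospheric realization
        J_sum += fisher_fn(a)                                # J_D for that realization
    J_T = J_sum / n_trials + np.linalg.inv(prior_cov)        # (6.67)
    return float(np.trace(np.linalg.inv(J_T)) + np.sum(unestimated_var))  # (6.68)

def toy_fisher(a):
    """Hypothetical J_D that loses information as the aberration grows."""
    return np.array([[4.0, 1.0], [1.0, 2.0]]) / (1.0 + a @ a)

rng = np.random.default_rng(1)
prior_cov = np.diag([1.0, 0.5])
tail_var = [0.10, 0.05]          # hypothetical variances of modes outside S
bound = monte_carlo_bound(toy_fisher, prior_cov, tail_var, 2000, rng)
print(bound)
```

The variance of the modes outside $S$ sets a floor on the bound that no choice of design variables can remove, while the trace term cannot exceed the prior variance of the estimated modes.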


Whole Plane Projection CRLB. The $Z_{2-4}$ sensor design variables along with the operational variables constitute a multi-dimensional domain space for the CRLB function. An exhaustive search for ideal design settings would involve minimizing the CRLB over the range of all design variables for every location in the operational space. Rather than attempt such a global minimization, I will examine the CRLB at a limited number of operating points, and the minimization will be conducted with respect to each design variable independently. Further, to reduce the amount of analysis shown here, the operational space will be limited to a range over the Fried parameter, $r_0$, and the average photon count per subaperture per exposure period, $K$. The set of design variables will be limited to:

\begin{align}
\theta_2 - \theta_1 &= \text{projection separation angle,} \tag{6.69}\\
\theta_1 &= \text{the first CCD projection angle,} \tag{6.70}\\
\pm\delta a_4 &= \text{defocus diversity,} \tag{6.71}\\
\text{and } N_W &= \text{window size.} \tag{6.72}
\end{align}


Within the inertial range, the analytical expression for the phase spectrum relies heavily on the ratio $D_P/r_0$. For this reason, performance within the inertial range is largely a function of the ratio $D_P/r_0$ rather than of each variable independently. I have therefore chosen to fix $D_P$ and vary $r_0$. This will provide an indication of how performance varies with the ratio without independently varying both $D_P$ and $r_0$. For all CRLB plots included in this section, the subaperture diameter will be fixed at 0.07 m and the environment variables $L_0$ and $l_0$ are fixed at $L_0 = 10$ m and $l_0 = 0.01$ m. In Figures 6.2 through 6.7 the environment variable $r_0$ is fixed at 0.05 m. Figure 6.2 demonstrates how the CRLB for residual mean squared error varies with respect to separation angle between the two CCD arrays. The CRLB indicates that a separation angle of approximately 90 degrees between image projections is ideal. Using this ideal separation angle, the CRLB can be plotted for varying start angle. For instance, Figure 6.3 shows how the CRLB varies with respect to choice of projection angle $\theta_1$ given a separation angle of 90 degrees. The CRLB varies randomly over the range of $\theta_1$ angles in Figure 6.3. The variance of the CRLB is small enough to indicate that no significant change in the lower bound occurs over the range of $\theta_1$ values. This reveals that for a fixed separation angle of 90 degrees between projections, the CRLB is effectively invariant to CCD rotation. Based on these results, the remaining CRLB plots will depict configurations where $\theta_1 = 0°$ and $\theta_2 = 90°$. Figure 6.4 shows how the CRLB indicates the ideal defocus diversity for a high SNR case. Examining the plot in Figure 6.4, the CRLB is minimized when the defocus diversity is approximately 0.35 radians. Similarly, Figure 6.5 shows that the ideal defocus diversity in a low SNR case is approximately 0.15 radians. Given the ideal diversity choices, the CRLB may be used to select a window length. Figure 6.6 demonstrates how the CRLB varies over window length in high SNR. Figure 6.7 demonstrates how the CRLB varies over window length in low SNR. The CRLB versus $N_W$ results demonstrate that the benefit gained from increasing the window length beyond 11 pixels is minimal. In practice, the ideal window length will be driven largely by the maximum amount of time allotted for CCD read out and the time required to manage the additional pixels in each image projection. The CRLB plots here are useful in that they demonstrate a knee in the performance curve around the 7 to 11 pixel range. Using the middle of the knee on the CRLB versus $N_W$ plots, I have selected a 9 x 9 pixel window and plotted the CRLB over a range of $r_0$ and $K$ values. The CRLB plot in Figure 6.8 demonstrates performance using a 9 x 9 pixel window. Numbers at each point for the $Z_{2-4}$ CRLB plot indicate the ideal defocus diversity for that point in the operating space.

Figure 6.2 $Z_{2-4}$ estimator $\langle\epsilon^2\rangle$ lower bound versus separation angle, $\theta_2 - \theta_1$. $D_P = 0.07$ m, $\sigma_{ro} = 2.13$ counts, $r_0 = 0.05$ m, $L_0 = 10$ m, $l_0 = 0.01$ m, and $N_W = 14$ pixels.

Figure 6.3 $Z_{2-4}$ estimator $\langle\epsilon^2\rangle$ lower bound versus projection angle $\theta_1$ given that $\theta_2 = \theta_1 + 90°$. $D_P = 0.07$ m, $\sigma_{ro} = 2.13$ counts, $r_0 = 0.05$ m, $L_0 = 10$ m, $l_0 = 0.01$ m, and $N_W = 14$ pixels.

Figure 6.4 $Z_{2-4}$ estimator $\langle\epsilon^2\rangle$ lower bound versus $\pm\delta a_4$ for $K = 1000$ photons per subaperture (high SNR). $D_P = 0.07$ m, $\sigma_{ro} = 2.13$ counts, $r_0 = 0.05$ m, $L_0 = 10$ m, $l_0 = 0.01$ m, and $N_W = 9$ pixels.

Figure 6.5 $Z_{2-4}$ estimator $\langle\epsilon^2\rangle$ lower bound versus $\pm\delta a_4$ for $K = 100$ photons per subaperture (low SNR). $D_P = 0.07$ m, $\sigma_{ro} = 2.13$ counts, $r_0 = 0.05$ m, $L_0 = 10$ m, $l_0 = 0.01$ m, and $N_W = 9$ pixels.

Figure 6.6 $Z_{2-4}$ estimator $\langle\epsilon^2\rangle$ lower bound versus $N_W$ for $K = 1000$ photons per subaperture (high SNR). $D_P = 0.07$ m, $\sigma_{ro} = 2.13$ counts, $r_0 = 0.05$ m, $L_0 = 10$ m, and $l_0 = 0.01$ m.

Figure 6.7 $Z_{2-4}$ estimator $\langle\epsilon^2\rangle$ lower bound versus $N_W$ for $K = 100$ photons per subaperture (low SNR). $D_P = 0.07$ m, $\sigma_{ro} = 2.13$ counts, $r_0 = 0.05$ m, $L_0 = 10$ m, and $l_0 = 0.01$ m.
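The one-variable-at-a-time procedure described above (separation angle first, then defocus diversity, then a window length chosen at the knee) can be sketched as below. Everything in this block is hypothetical: `toy_bound` is an invented smooth function standing in for a Monte Carlo CRLB evaluation, its minima are placed by hand near the values reported for the high SNR case, and the 2 percent knee tolerance is an arbitrary choice.

```python
import numpy as np

def toy_bound(sep_deg, diversity_rad, window_px):
    """Hypothetical stand-in for a CRLB evaluation; not the dissertation's model."""
    return (0.05 + 1e-5 * (sep_deg - 90.0) ** 2
            + 0.2 * (diversity_rad - 0.35) ** 2
            + 0.5 * np.exp(-0.6 * window_px))

def sweep_min(bound_fn, grid):
    """Return the grid value that minimizes bound_fn."""
    values = np.array([bound_fn(g) for g in grid])
    return grid[int(np.argmin(values))]

def knee(bound_fn, grid, tol=0.02):
    """Smallest grid value whose bound is within tol (relative) of the best on the grid."""
    values = np.array([bound_fn(g) for g in grid])
    ok = values <= (1.0 + tol) * values.min()
    return grid[int(np.argmax(ok))]

seps = np.arange(10, 180, 10)                      # separation angles in degrees
divs = np.round(np.arange(0.0, 1.0001, 0.05), 2)   # defocus diversity in radians
wins = np.arange(3, 16)                            # window lengths in pixels

# Start from an arbitrary configuration and update one variable at a time.
sep, div, win = 45, 0.0, 14
sep = sweep_min(lambda s: toy_bound(s, div, win), seps)
div = sweep_min(lambda v: toy_bound(sep, v, win), divs)
win = knee(lambda w: toy_bound(sep, div, w), wins)
print(sep, div, win)
```

The toy function is separable, so a single pass over the variables finds its minimum exactly. A real bound need not be separable, which is why independently minimized settings are better treated as a starting point than as a global optimum.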

Define the operating space as a set of environment variables each with some expected range of values. The CRLB can be used to identify a region within the operating space where a particular type of estimator is optimal. As an example, Figure 6.8 provides a comparison of the CRLB for the $Z_{2-4}$ sensor for the case where only $a_2$ and $a_3$ are estimated (define this as the $Z_{2,3}$ estimator) versus the case where parameters $a_2$ through $a_4$ are estimated. Dashed lines indicate $Z_{2,3}$ estimator performance while solid lines indicate $Z_{2-4}$ estimator performance. Both cases are demonstrated over a range of $r_0$ and $K$. The ordered pairs in parentheses on the right hand side of the plot region indicate the ratio $D_P/r_0$ and the value of $r_0$ for the plot lines ending closest to the ordered pair. Once again, numbers at each plot point location for the $Z_{2-4}$ estimator indicate the ideal choice of diversity for that point in the operating space. Note that the ideal choice of diversity for the $Z_{2,3}$ estimator is always 0 radians, and therefore the plot line points for the $Z_{2,3}$ estimator have no diversity indicators. This figure is included because it provides an indication of the region in the operating space where there is opportunity for increased performance via estimating $Z_4$. Specifically, the figure indicates that there is a limit, in the lower left of the plot region, as SNR decreases and as $r_0$ increases (as the ratio $D_P/r_0$ decreases), beyond which there is little or no benefit from estimating $a_4$.

Figures 6.2 through 6.7 demonstrate that minimum CRLB is a useful measure for determining ideal design variable settings. Figure 6.8 demonstrates that the CRLB is also useful for identifying regions in the operating space where a particular estimator can provide superior performance. Actual performance will only match the CRLB in the event that the estimator is efficient. Thus, the assumption is that actual estimator performance trends the same as the CRLB, such that the CRLB provides a good starting point for design choices. Along with performance comparisons to other estimators, Chapter 8 will provide performance plots to compare with the CRLB results presented here. Also, the degree to which simulated performance reflects the same ideal design variable settings will determine the effectiveness of using the CRLB to make such preliminary design choices. Comparing simulated performance to the CRLB provides a means with which to verify that the sensor simulation is producing valid results. In this manner, simulated performance and CRLB results serve to complement one another.

Figure 6.8 Lower bounds on $\langle\epsilon^2\rangle$ versus $K$ for several cases of $r_0$. Dashed lines indicate $Z_{2,3}$ estimator performance bounds. Solid lines indicate $Z_{2-4}$ estimator performance bounds.


Half Plane Projection CRLB. In addition to the whole plane projection sensor discussed in Chapter 5, this dissertation will provide a discussion detailing a half plane projection sensor in Chapter 9. The half plane projection sensor is designed to estimate coefficients for Zernike polynomials $Z_2$ through $Z_{10}$. Just as in the previous subsection, a set of figures containing CRLB results for the half plane projection sensor is provided here as a demonstration of how the CRLB can be used to choose ideal design settings. Beginning with projection separation angle and projection starting angle, Figures 6.9 and 6.10 show that the results for whole plane projections hold true for half plane projections as well: the ideal separation angle is 90 degrees, and, given that $\theta_2 = \theta_1 + 90°$, the CRLB is effectively invariant over a range of $\theta_1$ values. Figures 6.11 and 6.12 demonstrate that the ideal diversity settings are approximately 0.55 radians in high SNR and 0.4 radians in low SNR. Although the values for the ideal diversity in each case here differ from the $Z_{2-4}$ sensor cases, there is a trend worth emphasizing: ideal diversity decreases as SNR decreases. This is an indication that higher order coefficients are difficult to estimate in low signal situations. In these situations, the sensor performs better by reducing diversity. Reducing the amount of diversity in the sensor improves tilt and lower order coefficient estimates at the cost of higher order coefficients. The CRLB plots in Figures 6.13 and 6.14 suggest that the knee in performance improvement with respect to CCD window size occurs in the 8 to 12 pixel range. Finally, Figures 6.15 and 6.16 provide plots of the CRLB over a region of the operating space. In these figures, several design variables are fixed: $\theta_1 = 0°$, $\theta_2 = 90°$, and $N_W = 9$ pixels. The environment variables $L_0$ and $l_0$ are also fixed at $L_0 = 10$ m and $l_0 = 0.01$ m. The CRLB is plotted as $r_0$ varies in several steps from 0.04 m to 0.14 m and as $K$ varies from 100 to 1000 average photons. The lower bound for the $Z_{2-10}$ sensor is compared to the bound for estimating tilt only (the $Z_{2,3}$ sensor). Dashed lines indicate the $Z_{2,3}$ sensor bound while solid lines indicate the $Z_{2-10}$ sensor bound. Numbers at each plot point location for the $Z_{2-10}$ sensor indicate the ideal choice of diversity for that point in the operating space. Once again, the ideal diversity for estimating tilt only is assumed to be zero.

Figure 6.9 $Z_{2-10}$ estimator $\langle\epsilon^2\rangle$ lower bound versus separation angle, $\theta_2 - \theta_1$. $D_P = 0.07$ m, $\sigma_{ro} = 2.13$ counts, $r_0 = 0.05$ m, $L_0 = 10$ m, $l_0 = 0.01$ m, and $N_W = 14$ pixels.

Figure 6.10 $Z_{2-10}$ estimator $\langle\epsilon^2\rangle$ lower bound versus projection angle $\theta_1$ given that $\theta_2 = \theta_1 + 90°$. $D_P = 0.07$ m, $\sigma_{ro} = 2.13$ counts, $r_0 = 0.05$ m, $L_0 = 10$ m, $l_0 = 0.01$ m, and $N_W = 14$ pixels.

Figure 6.11 $Z_{2-10}$ estimator $\langle\epsilon^2\rangle$ lower bound versus $\pm\delta a_4$ for $K = 1000$ photons per subaperture (high SNR). $D_P = 0.07$ m, $\sigma_{ro} = 2.13$ counts, $r_0 = 0.05$ m, $L_0 = 10$ m, $l_0 = 0.01$ m, and $N_W = 9$ pixels.

Figure 6.12 $Z_{2-10}$ estimator $\langle\epsilon^2\rangle$ lower bound versus $\pm\delta a_4$ for $K = 100$ photons per subaperture (low SNR). $D_P = 0.07$ m, $\sigma_{ro} = 2.13$ counts, $r_0 = 0.05$ m, $L_0 = 10$ m, $l_0 = 0.01$ m, and $N_W = 9$ pixels.

Figure 6.13 $Z_{2-10}$ estimator $\langle\epsilon^2\rangle$ lower bound versus $N_W$ for $K = 1000$ photons per subaperture (high SNR). $D_P = 0.07$ m, $\sigma_{ro} = 2.13$ counts, $r_0 = 0.05$ m, $L_0 = 10$ m, and $l_0 = 0.01$ m.

Figure 6.14 $Z_{2-10}$ estimator $\langle\epsilon^2\rangle$ lower bound versus $N_W$ for $K = 100$ photons per subaperture (low SNR). $D_P = 0.07$ m, $\sigma_{ro} = 2.13$ counts, $r_0 = 0.05$ m, $L_0 = 10$ m, and $l_0 = 0.01$ m.

6.4  Summary 

This chapter proposed that the MSE performance measure provides a sound method for comparing simulated wavefront sensor designs. The discussion began with a derivation of the residual MSE in terms of the von Kármán atmospheric modal parameters. A few cases of perfect estimator performance were plotted in order to demonstrate the relationship between residual wavefront MSE, the set of parameters estimated, and the size of the aperture. In the section on the CRLB, the relationship between wavefront residual MSE and estimator error was highlighted. Considering the wavefront sensor as a parameter estimator, the portion of the wavefront MSE which can be affected translates directly to the estimator mean squared error. For unbiased estimators, the estimator mean squared error becomes an estimator variance. The Cramér-Rao lower bound provides a lower bound on estimator variance. A method was developed for approximating the CRLB for the case where the sensor CCDs are modeled as a Poisson process. This lower bound provides a basis for validating simulated sensor performance and for selecting ideal settings for sensor design variables. The CRLB was computed and example plots were provided for both the whole plane and half plane projection based sensors. In each case, the CRLB was used to identify ideal settings for key design variables and regions within the operating space where each sensor has the potential to offer improved performance over the tilt only sensor. It is proposed that the CRLB will provide a complementary performance measure against which to compare simulated performance. The combination of CRLB results and simulated performance results should serve to provide confidence in the sensor simulation results and in the use of the CRLB as a means of establishing initial design variable settings.

Figure 6.15 Lower bounds on $\langle\epsilon^2\rangle$ versus $K$ for several cases of $r_0$. Dashed lines indicate $Z_{2,3}$ estimator performance bounds. Solid lines indicate $Z_{2-10}$ estimator performance bounds.

Figure 6.16 Lower bounds on $\langle\epsilon^2\rangle$ versus $K$ for several cases of $r_0$. Dashed lines indicate $Z_{2,3}$ estimator performance bounds. Solid lines indicate $Z_{2-10}$ estimator performance bounds.

7. Simulating the Atmosphere

Simulating the wavefront sensor requires an implementation of the atmospheric model derived in Section 2.2. Recall two of the key assumptions within the atmospheric model: segments of the atmosphere can be compressed into discrete layers, and the atmospheric phase is ergodic. These characteristics serve as guidelines for determining the suitability of random process generation techniques for creating realizations of the statistical phase model. The first assumption allows dividing the atmosphere into discrete layers, referred to as the thin screen approach. The random effects of phase delay in each layer are condensed into an infinitesimally thin phase modifier. The assumption of ergodicity implies that, within each thin screen, spatial statistics equal, in a statistical sense, appropriate temporal statistics [53]. Using this assumption, a temporally evolving atmosphere can be modeled under the assumptions of Taylor's frozen flow. Taylor's frozen flow hypothesis assumes that the random nature of the phase in each layer is frozen except for a constant velocity perpendicular to the optical axis [3]. Therefore, to accommodate both the thin screen and ergodicity assumptions, the simulation must be able to generate thin phase realizations of arbitrary length. The Fourier series method of phase screen generation is a good fit for both of these criteria. Application of the Fourier series technique using an equally spaced Cartesian sampling structure introduces significant computational complexity. Logarithmically spaced sample structures have been suggested to reduce complexity and emphasize lower frequencies [55]. Below, I will investigate the log-Cartesian sampling structure and suggest a log-polar modification to improve performance. To measure the fidelity of each technique, I compare the statistics of the random screens to the derived atmospheric model. Many phase screen generation codes use a visual inspection of the structure function as a verification of performance [56]. I will modify this measure in two ways. First, I will use structure function percent error as a performance measure. Second, I will identify the level of isotropy in the resulting screens by measuring structure function percent error over a range of angles. The curvature sensor simulations discussed in Chapters 8 and 9 include an implementation of the log-polar phase screen technique. Statistical errors in the phase screens have the potential to affect the sensor simulation results. In particular, the $Z_{2-4}$ and $Z_{2-10}$ sensors rely on accurate representation of lower order Zernike variance. For this reason, I include the variance of the estimated Zernike coefficients in the phase screen performance measure. The sections to follow provide a review of Fourier series phase screen generation, a note on the benefits of log-polar frequency sampling, and a comparison of phase screen statistical accuracy when using log-polar versus log-Cartesian sampling.

7.1  Fourier  Series  Phase  Screen  Generation 

The mathematical foundation for the atmospheric simulation was outlined in Section 2.2,
where I presented the derivation of the optical application of Kolmogorov’s turbulence
model and the von Kármán phase power spectrum, $\Phi_{P_\phi}$ (note: the Fourier series technique
is not limited to any particular power spectrum). All that remains is to formulate a
method for creating random phase realizations with the same statistical characteristics as
the atmospheric model. The Fourier series method is a common technique for creating
random processes from spectral statistics. The following paragraphs review
the derivation of the Fourier series technique. Begin with the well known statistical
relationship between the power spectrum, $\Phi_{P_\phi}$, and the autocorrelation, $B_{P_\phi}$, of a wide
sense stationary (WSS) random process, $P_\phi$ [22]:

$$\Phi_{P_\phi}(\mathbf{K}) = \frac{1}{(2\pi)^2} \int B_{P_\phi}(\mathbf{R}) \exp\{-j\mathbf{K}\cdot\mathbf{R}\}\, d\mathbf{R}. \tag{7.1}$$


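This relationship, along with $F_\phi$, the Fourier transform of $P_\phi$, is necessary to develop the
method. Consider a realization of the pupil phase, $P_\phi$, denoted $\tilde{P}_\phi$. Under the definition
of WSS, $\tilde{F}_\phi$ is delta correlated on $\mathbf{K}$ with average value equal to the power spectrum $\Phi_{P_\phi}$
[22]. To demonstrate this property, begin with the Fourier transform of $\tilde{P}_\phi$:

$$\tilde{F}_\phi(\mathbf{K}) = \int \tilde{P}_\phi(\mathbf{R}) \exp\{-j\mathbf{K}\cdot\mathbf{R}\}\, d\mathbf{R}. \tag{7.2}$$

Forming the correlation yields:

$$E\{\tilde{F}_\phi(\mathbf{K}_1)\tilde{F}_\phi^*(\mathbf{K}_2)\} = E\left\{ \int \tilde{P}_\phi(\mathbf{R}_1) \exp\{-j\mathbf{K}_1\cdot\mathbf{R}_1\}\, d\mathbf{R}_1 \left[ \int \tilde{P}_\phi(\mathbf{R}_2) \exp\{-j\mathbf{K}_2\cdot\mathbf{R}_2\}\, d\mathbf{R}_2 \right]^* \right\}. \tag{7.3}$$

Applying $\left(\int f\right)^* = \int f^*$ gives:

$$E\{\tilde{F}_\phi(\mathbf{K}_1)\tilde{F}_\phi^*(\mathbf{K}_2)\} = E\left\{ \iint \tilde{P}_\phi(\mathbf{R}_1)\tilde{P}_\phi^*(\mathbf{R}_2) \exp\{-j(\mathbf{K}_1\cdot\mathbf{R}_1 - \mathbf{K}_2\cdot\mathbf{R}_2)\}\, d\mathbf{R}_1 d\mathbf{R}_2 \right\}. \tag{7.4}$$

Exchanging the order of integration and expectation yields:

$$E\{\tilde{F}_\phi(\mathbf{K}_1)\tilde{F}_\phi^*(\mathbf{K}_2)\} = \iint E\{\tilde{P}_\phi(\mathbf{R}_1)\tilde{P}_\phi^*(\mathbf{R}_2)\} \exp\{-j(\mathbf{K}_1\cdot\mathbf{R}_1 - \mathbf{K}_2\cdot\mathbf{R}_2)\}\, d\mathbf{R}_1 d\mathbf{R}_2, \tag{7.5}$$

where the expectation is the ensemble average over many instances of $\tilde{F}_\phi$. Making the
substitution $\mathbf{R}_1 = \mathbf{R}_2 - \mathbf{R}$ and applying the property in (7.1) above:

$$E\{\tilde{F}_\phi(\mathbf{K}_1)\tilde{F}_\phi^*(\mathbf{K}_2)\} = \int \exp\{-j\mathbf{R}_2\cdot(\mathbf{K}_1 - \mathbf{K}_2)\}\, d\mathbf{R}_2 \times \int B_{P_\phi}(\mathbf{R}) \exp\{-j\mathbf{K}_1\cdot\mathbf{R}\}\, d\mathbf{R}, \tag{7.6}$$

$$= \delta(\mathbf{K}_1 - \mathbf{K}_2)\,(2\pi)^4\, \Phi_{P_\phi}(\mathbf{K}_1). \tag{7.7}$$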


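Thus, the correlation of the transform of the random process is related to the power
spectrum. This relationship suggests that it is possible to generate a realization of $P_\phi$ by filtering
a complex white noise process, $N$, with the power spectrum. For instance, suppose the form
$\tilde{F}_\phi(\mathbf{K}) = (2\pi)^2\, \Phi_{P_\phi}^{1/2}(\mathbf{K})\, N(\mathbf{K})$. Inverse transforming $\tilde{F}_\phi(\mathbf{K})$ would yield $\tilde{P}_\phi$, an instance
of $P_\phi$. The question remains: what are the requirements for the noise process? The noise
must have an appropriate mean and variance. Begin by solving for the mean:

$$E\{\tilde{F}_\phi(\mathbf{K})\} = \int E\{\tilde{P}_\phi(\mathbf{R})\} \exp\{-j\mathbf{K}\cdot\mathbf{R}\}\, d\mathbf{R}. \tag{7.8}$$

The expected value of $P_\phi$ is zero per the atmospheric model. Substituting the suggested
form for $\tilde{F}_\phi$ into the left hand side above and evaluating the expectation gives:

$$(2\pi)^2\, \Phi_{P_\phi}^{1/2}(\mathbf{K})\, E\{N(\mathbf{K})\} = 0, \tag{7.9}$$

$$E\{\mathrm{Re}\{N(\mathbf{K})\}\} + jE\{\mathrm{Im}\{N(\mathbf{K})\}\} = 0. \tag{7.10}$$

Thus, the real and imaginary parts of $N$ must be zero mean. Now find the second moment
by substituting the suggested $\tilde{F}_\phi$ into (7.7) and simplifying:

$$E\left\{ (2\pi)^2\, \Phi_{P_\phi}^{1/2}(\mathbf{K}_1) N(\mathbf{K}_1) \times (2\pi)^2\, \Phi_{P_\phi}^{1/2}(\mathbf{K}_2) N^*(\mathbf{K}_2) \right\} = (2\pi)^4\, \delta(\mathbf{K}_1 - \mathbf{K}_2)\, \Phi_{P_\phi}(\mathbf{K}_1), \tag{7.11}$$

$$(2\pi)^2\, \Phi_{P_\phi}^{1/2}(\mathbf{K})\, (2\pi)^2\, \Phi_{P_\phi}^{1/2}(\mathbf{K})\, E\{N(\mathbf{K})N^*(\mathbf{K})\} = (2\pi)^4\, \Phi_{P_\phi}(\mathbf{K}), \tag{7.12}$$

$$E\{N(\mathbf{K})N^*(\mathbf{K})\} = 1. \tag{7.13}$$

Expanding $N$ into its real and imaginary components and simplifying:

$$E\left\{ (\mathrm{Re}\{N(\mathbf{K})\} + j\,\mathrm{Im}\{N(\mathbf{K})\}) \times (\mathrm{Re}\{N(\mathbf{K})\} - j\,\mathrm{Im}\{N(\mathbf{K})\}) \right\} = 1, \tag{7.14}$$

$$E\{(\mathrm{Re}\{N(\mathbf{K})\})^2\} + E\{(\mathrm{Im}\{N(\mathbf{K})\})^2\} = 1. \tag{7.15}$$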


From (7.10) it is clear that $N$ must be zero mean to create the zero mean process $\tilde{P}_\phi$;
however, according to (7.15) the variance of $N$ depends on the combination of its real and
imaginary parts. If $P_\phi$ is a real process, then the real and imaginary parts of $\tilde{F}_\phi$ must be
considered independently, which implies that the real and the imaginary parts must each
have a variance of 1:

$$E\{(\mathrm{Re}\{N(\mathbf{K})\})^2\} = 1, \tag{7.16}$$

$$E\{(\mathrm{Im}\{N(\mathbf{K})\})^2\} = 1. \tag{7.17}$$

Thus, the form for $\tilde{F}_\phi$ is acceptable provided that the real and imaginary parts of $N$ are zero
mean, unit variance WSS processes. Combining these results, the inverse Fourier transform
provides a method for creating realizations of the phase process as follows:

$$\tilde{P}_\phi(\mathbf{R}) = \frac{1}{(2\pi)^2} \int (2\pi)^2\, \Phi_{P_\phi}^{1/2}(\mathbf{K})\, N(\mathbf{K}) \exp\{j\mathbf{K}\cdot\mathbf{R}\}\, d\mathbf{K}, \tag{7.18}$$

$$= \int \Phi_{P_\phi}^{1/2}(\mathbf{K})\, N(\mathbf{K}) \exp\{j\mathbf{K}\cdot\mathbf{R}\}\, d\mathbf{K}. \tag{7.19}$$

Finally, separating $\tilde{P}_\phi$ into real and imaginary components, the method produces two
independent realizations:

$$P_{\phi_1}(\mathbf{R}) = \mathrm{Re}\{\tilde{P}_\phi(\mathbf{R})\}, \tag{7.20}$$

$$P_{\phi_2}(\mathbf{R}) = \mathrm{Im}\{\tilde{P}_\phi(\mathbf{R})\}, \tag{7.21}$$

where $\mathrm{Re}\{N(\mathbf{K})\}$ and $\mathrm{Im}\{N(\mathbf{K})\}$ are distributed $\mathcal{N}(0, 1)$. This technique is acceptable
where many random screens are needed; however, if only a single screen is required for
simulation, it is common to take advantage of the spectral symmetry in real functions.
Recall that the Fourier transform of a real valued function, $x(t)$, is Hermitian symmetric:
$\mathcal{F}\{x(t)\}(f) = \mathcal{F}^*\{x(t)\}(-f)$. Using this relationship,
it is possible to create $N$ as a Hermitian symmetric Gaussian noise process with real and
imaginary parts distributed $\mathcal{N}(0, \tfrac{1}{2})$:

$$E\{(\mathrm{Re}\{N(\mathbf{K})\})^2\} + E\{(\mathrm{Im}\{N(\mathbf{K})\})^2\} = 1, \tag{7.22}$$

$$E\{(\mathrm{Re}\{N(\mathbf{K})\})^2\} = \tfrac{1}{2}, \tag{7.23}$$

$$E\{(\mathrm{Im}\{N(\mathbf{K})\})^2\} = \tfrac{1}{2}. \tag{7.24}$$

Hermitian symmetry in $N$ guarantees that $\tilde{P}_\phi$ is a real random process. The benefit of
producing a single real screen is apparent when examining the reduction in computational
complexity in the Fourier series approximation.
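The effect of Hermitian symmetry can be checked numerically. The following is a minimal one-dimensional sketch in Python rather than the simulation code itself; the function names `hermitian_noise` and `synthesize`, the integer frequency index, and the omission of the zero-frequency term are assumptions made for illustration.

```python
import cmath
import math
import random


def hermitian_noise(num_freqs, rng):
    """Return {k: N[k]} for k = -num_freqs..num_freqs, k != 0, with N[-k] = conj(N[k]).

    Real and imaginary parts are zero mean Gaussian with variance 1/2,
    so that E{|N|^2} = 1 as in (7.22)-(7.24).
    """
    sigma = math.sqrt(0.5)
    noise = {}
    for k in range(1, num_freqs + 1):
        value = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        noise[k] = value
        noise[-k] = value.conjugate()  # enforce Hermitian symmetry
    return noise


def synthesize(noise, x):
    """Evaluate the Fourier series sum of N[k] exp(j k x) at position x."""
    return sum(n_k * cmath.exp(1j * k * x) for k, n_k in noise.items())


rng = random.Random(0)
noise = hermitian_noise(8, rng)
samples = [synthesize(noise, 0.1 * i) for i in range(50)]
max_imag = max(abs(s.imag) for s in samples)  # rounding-error level only
```

In this sketch the imaginary part of every synthesized sample vanishes to within floating point rounding, which is the property that allows a single real screen to be built from half of the frequency set.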


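In a computer application, the continuous Fourier transform must be approximated
by a finite Fourier series:

$$\tilde{P}_\phi[\mathbf{R}] = \sum_{\mathbf{K}_i \in \mathcal{K}} N[\mathbf{K}_i] \exp\{j\mathbf{K}_i\cdot\mathbf{R}\} \left( \Phi_{P_\phi}[\mathbf{K}_i]\, A[\mathbf{K}_i] \right)^{1/2}, \tag{7.25}$$

$$\text{where } \mathbf{K}_i : \Phi_{P_\phi}[\mathbf{K}_i]\, A[\mathbf{K}_i] = \int_{\boldsymbol{\Omega}_i} \Phi_{P_\phi}(\boldsymbol{\xi})\, d\boldsymbol{\xi}, \tag{7.26}$$

$$\text{and } A[\mathbf{K}_i] = \int_{\boldsymbol{\Omega}_i} d\boldsymbol{\xi}. \tag{7.27}$$

$\boldsymbol{\Omega}_i$ is a member of a set of bounded regions in $\mathbf{K}$ such that $\bigcup_i \boldsymbol{\Omega}_i$ = the set of all $\mathbf{K}$, and
$\bigcap_i \boldsymbol{\Omega}_i = \emptyset$. The substitution of brackets, $[\cdot]$, for parentheses, $(\cdot)$, indicates that continuous
valued functions are being evaluated at a set of discrete locations. The Fourier transform
has been approximated by a Fourier series evaluated at discrete locations $\mathbf{K}_i$ within the
domain $\mathbf{K}$. The domain $\mathbf{K}$ has been divided into a countably infinite set of regions $\boldsymbol{\Omega}_i$. A
single location $\mathbf{K}_i$ within each region $\boldsymbol{\Omega}_i$ is chosen such that the integrated power in the
region $\boldsymbol{\Omega}_i$ equals the spectral density function evaluated at that location multiplied by the
area of the region. The set $\mathcal{K}$ is the set of discrete $\mathbf{K}_i$ locations to be included in the series
approximation. The function $A[\mathbf{K}_i]$ represents the $\mathbf{K}$ domain area included in each region
$\boldsymbol{\Omega}_i$. For the case where Hermitian symmetry is enforced in the complex coefficients $N$, the
expression simplifies to:

$$\tilde{P}_\phi[\mathbf{R}] = \sum_{\mathbf{K}_i \in \mathcal{K}'} 2 A^{1/2}[\mathbf{K}_i]\, \Phi_{P_\phi}^{1/2}[\mathbf{K}_i] \left( \mathrm{Re}\{N[\mathbf{K}_i]\} \cos(\mathbf{K}_i\cdot\mathbf{R}) - \mathrm{Im}\{N[\mathbf{K}_i]\} \sin(\mathbf{K}_i\cdot\mathbf{R}) \right), \tag{7.28}$$

where the set $\mathcal{K}'$ includes a unique half of the Hermitian symmetric set $\mathcal{K}$. Generating a
real $\tilde{P}_\phi$ requires half the number of computations expended in creating a complex $\tilde{P}_\phi$.

If $\tilde{P}_\phi$ is to be generated on a set of points where the x and y locations are known
in advance, then the trigonometric function evaluations can be precomputed and stored to
speed up final evaluation of the Fourier kernel. Expressing the $\mathbf{K}$ domain frequencies in
x and y components, the independent kernel computations for a complex $\tilde{P}_\phi$ are:

$$\exp\{j\mathbf{K}\cdot\mathbf{R}\} = \exp\{jK_xR_x\} \times \exp\{jK_yR_y\}. \tag{7.29}$$

The kernel components for a real $\tilde{P}_\phi$ are:

$$\mathrm{Re}\{N(\mathbf{K})\} \cos(\mathbf{K}\cdot\mathbf{R}) - \mathrm{Im}\{N(\mathbf{K})\} \sin(\mathbf{K}\cdot\mathbf{R}) = \cos(K_xR_x) \left[ \mathrm{Re}\{N(\mathbf{K})\} \cos(K_yR_y) - \mathrm{Im}\{N(\mathbf{K})\} \sin(K_yR_y) \right] - \sin(K_xR_x) \left[ \mathrm{Re}\{N(\mathbf{K})\} \sin(K_yR_y) + \mathrm{Im}\{N(\mathbf{K})\} \cos(K_yR_y) \right]. \tag{7.30}$$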
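The separable form of the real kernel can be verified against a direct evaluation of the series. The sketch below is illustrative only: the three toy frequencies, the `weights` list (standing in for the square root of the area times the power spectrum), and the function names are assumptions, not values from the simulation.

```python
import math
import random


def direct_screen(freqs, weights, noise, xs, ys):
    """Evaluate the real series (7.28) term by term using cos/sin of K . R."""
    screen = [[0.0] * len(ys) for _ in xs]
    for a, rx in enumerate(xs):
        for b, ry in enumerate(ys):
            for (kx, ky), w, n in zip(freqs, weights, noise):
                arg = kx * rx + ky * ry
                screen[a][b] += 2.0 * w * (n.real * math.cos(arg) - n.imag * math.sin(arg))
    return screen


def separable_screen(freqs, weights, noise, xs, ys):
    """Evaluate the same series from precomputed x and y kernels as in (7.30)."""
    cos_x = [[math.cos(kx * rx) for rx in xs] for kx, _ in freqs]
    sin_x = [[math.sin(kx * rx) for rx in xs] for kx, _ in freqs]
    cos_y = [[math.cos(ky * ry) for ry in ys] for _, ky in freqs]
    sin_y = [[math.sin(ky * ry) for ry in ys] for _, ky in freqs]
    screen = [[0.0] * len(ys) for _ in xs]
    for i, (w, n) in enumerate(zip(weights, noise)):
        for a in range(len(xs)):
            for b in range(len(ys)):
                even = n.real * cos_y[i][b] - n.imag * sin_y[i][b]
                odd = n.real * sin_y[i][b] + n.imag * cos_y[i][b]
                screen[a][b] += 2.0 * w * (cos_x[i][a] * even - sin_x[i][a] * odd)
    return screen


rng = random.Random(1)
freqs = [(0.5, 0.0), (1.0, 2.0), (-3.0, 1.5)]   # toy half set of (Kx, Ky)
weights = [1.0, 0.5, 0.25]                       # stands in for sqrt(A * Phi)
sigma = math.sqrt(0.5)
noise = [complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in freqs]
xs = [0.1 * i for i in range(4)]
ys = [0.2 * i for i in range(3)]
max_diff = max(
    abs(p - q)
    for row_p, row_q in zip(direct_screen(freqs, weights, noise, xs, ys),
                            separable_screen(freqs, weights, noise, xs, ys))
    for p, q in zip(row_p, row_q)
)
```

The two evaluations agree to rounding error. The separable version needs trigonometric evaluations only per frequency and per axis position, which is what makes precomputation worthwhile on a fixed output grid.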


7.2  Improving  Isotropy  and  Reducing  Kernel  Size 

Given either formula for generating the phase screen, implementation begins by
selecting a finite set of discrete frequency locations to be included in $\mathcal{K}$. The finite set of
bounded regions $\boldsymbol{\Omega}_i$ will not include the entire domain $\mathbf{K}$. The domain $\mathbf{K}$ is instead
truncated such that the bulk of the power spectrum is represented in the series approximation.
Nyquist sampling requires that the $\mathbf{K}$ domain be bounded by a maximum frequency dictated
by the minimum spatial separation in the screen. The spatial resolution is in turn selected
to adequately represent the chosen spectrum or to satisfy the requirements of a given
simulation, whichever condition is more restrictive. Assume that spectral representation is the
driving requirement. In the case of a von Kármán spectrum it is important to adequately
represent spectral content within the inertial range. With this requirement in mind, the
minimum spatial dimension should be linked to the inner scale, $l_0$.


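$$\Delta x \le \frac{l_0}{3}, \tag{7.31}$$

$$f_{\max} = \frac{1}{2\Delta x} \ge \frac{3}{2 l_0}, \tag{7.32}$$

$$\kappa_{\max} = \frac{\pi}{\Delta x} \ge \frac{3\pi}{l_0} = \frac{3\pi \kappa_m}{5.92}, \tag{7.33}$$

$$\kappa_{\max} \ge 1.59\, \kappa_m. \tag{7.34}$$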
The choice of $\Delta x \le l_0/3$ ensures that the high frequency cut off lies beyond the inertial range.
Recall that beyond $\kappa_m$ the von Kármán spectrum decreases exponentially; therefore, the
bulk of the spectral power occurs below $\kappa_m$. Often, a minimum frequency is also specified.
This is necessary when the power spectrum is not absolutely integrable, as with Kolmogorov,
or when simulation restrictions require. If the von Kármán spectrum is truncated, then
the minimum frequency should be related to the outer scale:

$$f_{\min} \le \frac{1}{10 L_0}, \tag{7.35}$$

$$\kappa_{\min} = \frac{2\pi}{10 L_0} = \frac{\kappa_0}{10}. \tag{7.36}$$

Choosing $f_{\min} \le 1/(10 L_0)$ will ensure that $f_{\min}$ is below the inertial range. Once the
representative region in $\mathbf{K}$ is established, a method is required for delineating the disjoint regions
$\boldsymbol{\Omega}_i$. If the Fourier series coefficients are evaluated on an equally spaced Cartesian grid to
take advantage of Fast Fourier Transform (FFT) algorithms, the grid size becomes
computationally prohibitive. For example, the $\mathbf{K}$ grid for the case $L_0 = 10$ m and $l_0 = 0.01$ m
would contain $30000^2$ locations. Even after reducing this number by half for the real versus
complex implementation, the grid remains too cumbersome.

              # of K locations
    Q         real      complex
    √2        420       840
    2         760       1520
    4         3120      6240

Table 7.1  The number of K grid locations for a given Q: $L_0 = 10$ m and $l_0 = 0.01$ m
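The size of the equally spaced grid follows directly from the frequency bounds. The short Python calculation below is a sketch under one assumption that is not stated in the text: the grid spacing is taken to equal $\kappa_{\min}$ and each axis is taken to span $-\kappa_{\max}$ to $\kappa_{\max}$. Under that assumption the count matches the $30000^2$ figure quoted above.

```python
import math

L0 = 10.0   # outer scale in meters
l0 = 0.01   # inner scale in meters

kappa_max = 3.0 * math.pi / l0            # from (7.33) with dx = l0 / 3
kappa_min = 2.0 * math.pi / (10.0 * L0)   # from (7.36)

# Assumed grid: spacing kappa_min, spanning -kappa_max..kappa_max on each axis.
samples_per_axis = round(2.0 * kappa_max / kappa_min)
total_locations = samples_per_axis ** 2
```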

Taking advantage of the exponentially decaying power spectrum, it is possible to
compress the frequency domain information using a logarithmically spaced Cartesian grid.
In this case, the logarithms of the $\kappa$ boundaries are equally spaced. This is sometimes
referred to as a constant $Q$ frequency spacing:

$$Q = \frac{\kappa}{\Delta\kappa}, \tag{7.37}$$

$$\kappa = \text{sample point}, \tag{7.38}$$

$$\Delta\kappa = \text{sample bandwidth}. \tag{7.39}$$

The sizes of the resulting $\mathbf{K}$ grids for several $Q$ values are listed in Table 7.1. The tabled
grid sizes are five to six orders of magnitude smaller than the $30000^2 = 9 \times 10^8$ locations
required for equally spaced sampling, an overwhelming reduction in computational complexity.
Of course, these numbers are meaningless if the frequency domain
compression leads to unsatisfactory statistical error in the screen realizations. The following
section provides a demonstration of structure function percent error and error in lower
order Zernike variance for an implementation using these $Q$ values. Before reviewing the
performance of the log-Cartesian sampling method, however, I will suggest another sampling
scheme.
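The kernel sizes for a constant Q grid can be estimated with a few lines of Python. This is a sketch under conventions that are assumptions rather than statements from the text: each band's sample point is taken as the arithmetic center of the band, so adjacent band edges differ by the ratio (2Q + 1)/(2Q - 1); the grid is taken to be symmetric about zero on both axes; and the zero-frequency point is excluded. With these conventions the counts agree with Table 7.1 for $L_0 = 10$ m and $l_0 = 0.01$ m.

```python
import math


def bands_per_side(q, kappa_min, kappa_max):
    """Number of constant-Q bands needed to span kappa_min..kappa_max.

    Assumes each band [lo, hi] has Q = center / width with the center taken as
    the arithmetic mean, so that hi / lo = (2Q + 1) / (2Q - 1).
    """
    ratio = (2.0 * q + 1.0) / (2.0 * q - 1.0)
    return math.ceil(math.log(kappa_max / kappa_min) / math.log(ratio))


def log_cartesian_counts(q, kappa_min, kappa_max):
    """Return (real, complex) kernel sizes for a symmetric grid without the DC point."""
    n = bands_per_side(q, kappa_min, kappa_max)
    complex_count = (2 * n + 1) ** 2 - 1
    return complex_count // 2, complex_count


kappa_min = 2.0 * math.pi / (10.0 * 10.0)   # (7.36) with L0 = 10 m
kappa_max = 3.0 * math.pi / 0.01            # (7.33) with l0 = 0.01 m
counts = {q: log_cartesian_counts(q, kappa_min, kappa_max)
          for q in (math.sqrt(2.0), 2.0, 4.0)}
```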


Figure  7.1  (Left)  An  example  of  the  equispaced  Cartesian  sample  structure.  (Center) 
Log-Cartesian  sampling.  (Right)  Log-polar  sampling. 

The final form for the phase spectrum $\Phi_{P_\phi}$ is a function of the magnitude $\kappa$ rather
than $\mathbf{K}$. In an effort to take advantage of the radially symmetric nature of the power
spectrum, I suggest evaluating the Fourier series over a log-polar grid. On a log-polar grid,
the radial or $\kappa$ axis is sampled logarithmically while the angular axis $\psi$ is sampled equally.
This pattern of sampling offers a few advantages. Symmetric spacing of samples in the
angular dimension reduces anisotropy in the resulting screen realizations. $\Phi_{P_\phi}$ is symmetric
in the angular dimension within each sample region. This symmetry reduces the integral
required to find appropriate entries in $\mathcal{K}$ to a single dimension, $\kappa$:

$$\mathbf{K}_i = (\kappa_i, \psi_i), \tag{7.40}$$

where the $\psi_i$'s are chosen by equally spacing an arbitrary number of $\psi$ locations from 0 to
$2\pi$ radians at each distinct $\kappa$ entry and the $\kappa_i$'s are selected to satisfy:

$$\Phi_{P_\phi}[\kappa_i]\, \Delta\kappa_i = \int_{\Omega_i} \Phi_{P_\phi}(\xi)\, d\xi, \tag{7.41}$$

$$\text{and } \Delta\kappa_i = \int_{\Omega_i} d\xi. \tag{7.42}$$

$\Omega_i$ (not bold so as to be distinguished from $\boldsymbol{\Omega}_i$) is a member of a set of one-dimensional
bounded regions in $\kappa$ such that $\bigcup_i \Omega_i$ = the set of all $\kappa$, and $\bigcap_i \Omega_i = \emptyset$. This same
symmetry reduces the number of unique power, $\Phi_{P_\phi}$, and area, $A$, calculations. Finally,
using the polar sample structure makes the choice of sample density along $\kappa$ independent of
the number of samples in $\psi$. This is opposed to log-Cartesian sampling where the choice
of $Q$ dictates the density of samples in both spectral dimensions. In this way, log-polar
sampling provides increased flexibility over the log-Cartesian structure.
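A log-polar sample set can be sketched as follows. The function name `log_polar_half_grid` is hypothetical, and two conventions are assumptions made for illustration: ring edges grow by the constant ratio (2Q + 1)/(2Q - 1), and each ring is represented by its arithmetic center rather than by the power-matching location defined in (7.41). Only the unique half plane is generated, on the understanding that the other half follows from Hermitian symmetry.

```python
import math


def log_polar_half_grid(q, kappa_min, kappa_max, num_angles):
    """Return (kappa, psi) samples on the unique half plane 0 <= psi < pi."""
    ratio = (2.0 * q + 1.0) / (2.0 * q - 1.0)
    num_rings = math.ceil(math.log(kappa_max / kappa_min) / math.log(ratio))
    d_psi = 2.0 * math.pi / num_angles
    grid = []
    for ring in range(num_rings):
        lo = kappa_min * ratio ** ring
        kappa = 0.5 * (lo + lo * ratio)       # arithmetic center of the ring (assumed)
        for j in range(num_angles // 2):      # other half follows by Hermitian symmetry
            grid.append((kappa, j * d_psi))
    return grid


kappa_min = 2.0 * math.pi / (10.0 * 10.0)   # (7.36) with L0 = 10 m
kappa_max = 3.0 * math.pi / 0.01            # (7.33) with l0 = 0.01 m
grid = log_polar_half_grid(4.0, kappa_min, kappa_max, num_angles=24)
```

With Q = 4 and 24 angular samples per ring, this sketch produces 468 half-plane samples, and the radial density is set by Q alone while the angular density is set by `num_angles` alone.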


7.3  Comparing  Sampling  Methods 

Performance results using both log-Cartesian and log-polar frequency sampling are
presented here for comparison. For instance, Figures 7.2 and 7.3 demonstrate percent
structure function error at $Q = \sqrt{2}$ for log-Cartesian and log-polar sampling respectively.
Each simulation run includes the following set of standardized inputs: $r_0 = 0.088$ m, $L_0 = 10$ m,
$l_0 = 0.01$ m, and fixed frequency bounds $\kappa_{\min}$ and $\kappa_{\max}$. Structure function percent error is evaluated
at 1000 logarithmically spaced sample points in $|\mathbf{R}|$. To demonstrate isotropy, the direction
of each separation vector, $\mathbf{R}$, is varied in 5° increments from 0° to
45°. The solid plot lines indicate the minimum and maximum structure function percent error
over the range of directions. The + symbols indicate the spatial distances corresponding
to the set of discrete locations $\mathcal{K}$ along the $K_y = 0$ axis. To keep the number of kernel
samples consistent between the two methods of sampling, the number of samples in each
concentric ring of the log-polar grid is equal to the number of samples in each concentric
square in the log-Cartesian grid. The exception is the log-polar case for $Q = 4$ in Figure
7.7. For this case, I have elected to make the number of $\psi$ samples constant for all $\kappa$ rings
at 24 (increments of 15° in $\psi$). Organizing the sampling grid in this manner decreases the
total number of grid points from 3120 to 468. A comparison of the results in Figures 7.6
and 7.7 demonstrates that the log-polar technique can provide reduced structure function error and
improved isotropy even after an 85% reduction in the number of kernel grid points.
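The percent error envelope described above can be computed as in the following Python sketch. The separation range, the placeholder power law used for `theory`, and the synthetic `estimates` are all hypothetical stand-ins; in practice the theoretical curve would be the von Kármán structure function and the estimates would come from averaged screen realizations.

```python
import math


def log_spaced(start, stop, count):
    """Return count logarithmically spaced values from start to stop inclusive."""
    step = (math.log(stop) - math.log(start)) / (count - 1)
    return [math.exp(math.log(start) + i * step) for i in range(count)]


def percent_error(estimated, theoretical):
    """Signed percent error of an estimated structure function."""
    return [100.0 * (e - t) / t for e, t in zip(estimated, theoretical)]


def error_envelope(estimates_by_angle, theoretical):
    """Minimum and maximum percent error over all directions at each separation."""
    per_angle = [percent_error(est, theoretical) for est in estimates_by_angle.values()]
    lower = [min(col) for col in zip(*per_angle)]
    upper = [max(col) for col in zip(*per_angle)]
    return lower, upper


separations = log_spaced(0.01, 10.0, 1000)          # hypothetical range of |R| in meters
angles_deg = list(range(0, 50, 5))                  # 0, 5, ..., 45 degrees
theory = [r ** (5.0 / 3.0) for r in separations]    # placeholder model, not von Karman
estimates = {a: [t * (1.0 + 0.001 * a) for t in theory] for a in angles_deg}
lower, upper = error_envelope(estimates, theory)
```

The gap between `lower` and `upper` at a given separation is a direct measure of anisotropy: a perfectly isotropic screen ensemble would give identical error in every direction.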

In addition to limiting structure function error, maintaining accurate lower order
Zernike mode variances is important to the sensor simulation. The accuracy
of the Zernike variance within the set of estimated parameters can affect the performance
of the estimator by introducing error in the environment variable estimates. Table
7.2 contains Zernike coefficient variance results estimated from 100,000 screen realizations.
For this simulation, I maintained the same set of input parameters ($r_0 = 0.088$ m, $L_0 = 10$ m,
$l_0 = 0.01$ m, and the same $\kappa_{\min}$ and $\kappa_{\max}$) and added the diameter of a subaperture, $D = 0.07$ m.

Figure 7.2  Structure function percent error versus separation distance for log-Cartesian
sampling with $Q = \sqrt{2}$.

Table 7.2  Percent error in Zernike coefficient variance per varying Q value


    Zernike coefficient:    a_2,3   a_4   a_5,6   a_7,8   a_9,10   a_11
    Q = 4, Δψ = 15°         0       +3    +2      -1      +1       -2

Table 7.3  Percent error in Zernike coefficient variance.


Table 7.3 is provided to document Zernike variance for the case where $\psi$ sampling is constant.
Recall the case: $Q = 4$ with 24 $\psi$ samples in each $\kappa$ ring. Table 7.3 shows that, even after
reducing the total number of sample points from 3120 to 468, the screen variance statistics
yield similar error within the estimated Zernike set.

Reducing the size of the kernel offers an advantage in screen generation time; however,
the compression in frequency, while not evident in the average structure function or Zernike basis
error, becomes evident from visual inspection of individual phase screens. Upon inspection,
screens generated using logarithmic compression (of any sampling scheme) exhibit visible
patterns as a result of the decreased number of frequency modes. Figures 7.8 and 7.9
offer examples of screens created using log-Cartesian and log-polar frequency sample grids
respectively. In these example screens, the frequency density parameter $Q$ is set to $\sqrt{2}$,
resulting in a kernel size of 420 samples. The third example screen (Figure 7.10) combines
$Q = 3.6$ and $\Delta\psi = 15°$ using log-polar sampling in order to get a kernel size of 420
samples. Each of these examples demonstrates the described pattern effects. This gallery
of examples is provided merely to document that the compression becomes visible in the
output screens and not to downplay the use of log frequency compression techniques. The
measures of quality are statistically based rather than based on visual aesthetics. These
examples may exhibit visible compression effects, but they guarantee a specific statistical
quality threshold.


7.4  Summary

The  Fourier  series  method  for  random  process  generation  is  ideal  for  creating  the  type 
of  phase  realizations  required  in  thin  screen,  frozen  flow  atmospheric  simulations.  The 
discussion  above  included  a  review  of  the  mathematical  background  for  creating  random 
process realizations using the Fourier series technique. Applying the Fourier series technique
to phase screen generation revealed the need to trade reduced accuracy in the Fourier
series approximation for reduced computational complexity. The frequency domain sample
grid becomes too large when formed using an equally spaced Cartesian sampling structure.
This difficulty exposed the importance of using some method of compression in frequency.
Constant Q frequency sampling increases sampling density toward lower frequencies and,
as such, provides an ideal sample structure for compressing decreasing power law functions.
It has been shown that log-Cartesian grids offer an acceptable method for compressing
atmospheric spectral models and effectively reducing the complexity in the Fourier series
calculations for phase screen generation. I have shown that log-polar sampling
further reduces the number of frequency grid points required for some maximum percent
error in the structure function while increasing isotropy in the screen realizations.
Log-polar sampling also introduces symmetry in the sample structure, which when combined
with  symmetry  in  the  spectrum,  reduces  the  complexity  of  calculating  the  set  of  frequency 
domain  sample  points.  These  attributes  make  the  log-polar  sampled  Fourier  series  the 
method of choice for the core of the atmospheric simulation.

Figure 7.6  Structure function percent error versus separation distance for log-Cartesian
sampling with $Q = 4$.

Figure 7.7  Structure function percent error versus separation distance for log-polar
sampling with $Q = 4$ and 24 equally spaced $\psi$ samples in each concentric $\kappa$ band.

Figure 7.8  Example 1024 × 1024 pixel phase screen created using log-Cartesian frequency
sampling: $Q = \sqrt{2}$, $r_0 = 0.088$ m, $L_0 = 10$ m, $l_0 = 0.01$ m, $\Delta x = 0.0032$ m.

Figure 7.9  Example 1024 × 1024 pixel phase screen created using log-polar frequency
sampling: $Q = \sqrt{2}$, $r_0 = 0.088$ m, $L_0 = 10$ m, $l_0 = 0.01$ m, $\Delta x = 0.0032$ m.

Figure 7.10  Example 1024 × 1024 pixel phase screen created using log-polar frequency
sampling: $Q = 3.6$, $\Delta\psi = 15°$.

8.  Simulating the Z2-4 Wavefront Sensor

The sections to follow provide a brief description of the simulation techniques used and the
results of simulated Z2-4 sensor performance. The purpose of the simulation is to provide
a  proof  of  concept  for  the  sensor  design  and,  as  such,  should  include  sufficient  rigor  to 
provide  insight  into  whether  the  sensor  algorithm  is  viable  and  worthy  of  further  research. 
The  results  to  follow  demonstrate  that  the  Z2-4  sensor  outperforms  ML  and  centroiding 
techniques  for  point  source  wavefront  sensing. 


8.1  Constructing  the  Simulation 


The simulation is divided into four segments: the source, the atmosphere, the optical
system and the sensor algorithm. The diagram in Figure 8.1 shows these high level
segments. With the exception of the sensor algorithm, these segments are intended to model
the most significant environmental effects on sensor performance. The algorithm itself is
meant to mimic a real implementation as precisely as possible within a computer simulation. In the
following sections, I will present each of the segments as they exist for this simulation and
discuss any associated assumptions.

Figure 8.1  Simulation block diagram.

Source and Atmospheric Propagation Models. The source is assumed to be in the far
field and is modeled as a plane wave input into the atmospheric model. It is assumed that
some object (distant star or beacon) is emitting incoherent light and is spatially unresolvable
by the optical system. Assume also that the optical system passes some quasimonochromatic
band which will be represented by the center wavelength $\lambda$. Signal to noise ratio is
regulated within the optical system image plane and therefore the amplitude of the source
is arbitrary. The second segment in the simulation implements the atmospheric model
derived in Section 2.2. The atmospheric simulation replaces the column of atmosphere
between the source and the optical system with a thin phase screen. Within the model, light
propagates in a vacuum between the source and the screen, which is located at the optical
system aperture. The log-polar Fourier series method discussed in Chapter 7 is used to
generate phase screens with atmospheric variables $r_0$, $L_0$, and $l_0$, and input parameters
$Q = 4$ and $\Delta\psi = 5°$.


The Optical System. The simulated optical system is assumed to be aberration free.
The only aberration in the system is the known defocus diversity required by the wavefront
sensor algorithm. The optical system model begins with a discretized aperture. Modeling
a circular aperture on the Cartesian grid requires some approximation. The projection
operation for calculating Zernike coefficients given in Section 2.3 must be discretized:

$$a_i = \int d\boldsymbol{\rho}\, W_Z(\boldsymbol{\rho})\, Z_i(\boldsymbol{\rho})\, P_\phi(\rho R_p, \theta), \tag{8.1}$$

$$\approx \sum_{\mathbf{n}} W_Z[\mathbf{n}]\, Z_i[\mathbf{n}]\, P_\phi[\mathbf{n}]. \tag{8.2}$$

Here, $W_Z[\mathbf{n}]$ represents the discrete circular weighting function. If the integrated value of
the mask used to represent $W_Z[\mathbf{n}]$ is not exactly 1, the error will affect Zernike coefficient
calculations. For this reason, the pixels intersected by the edge of the circular mask are
carefully weighted by calculating the area of the trapezoidal and chordal regions indicated
in Figure 8.2. Propagation from the pupil to the image plane is performed via the linear
systems approach using a scaled Discrete Fourier Transform:


1  [u;  A] 

-  (Asj)2  E  p  K  A]  [-“]}■ 

(8.3) 

=  (^flVT{P[n:A]}, 

(A  Si) 

(8.4) 

I  [u;  A] 

K  l^[u;A]|2 

Y  \x  Iu;A]l2’ 

(8.5) 

=  K  \T  [u;  A]  2  . 

(8.6) 
Figure 8.2  Diagram of a single pixel bisected by a circular arc near the perimeter of a
circular aperture placed over a Cartesian grid. (Diagram labels: circular aperture boundary,
chord region.)
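The effect of partial pixel weighting on the aperture mask can be illustrated with a short Python sketch. Note the substitution: the text computes exact trapezoidal and chordal areas, whereas this sketch estimates each pixel's covered fraction by supersampling, which is simpler to write but only approximate. The function names and the 16 pixel aperture are assumptions made for illustration.

```python
def aperture_mask(n, subsamples=8):
    """Return an n x n mask whose entries are the covered fraction of each pixel.

    The circle has a diameter of n pixels and is centered on the grid. Coverage
    is estimated by supersampling, a stand-in for the exact area calculation.
    """
    radius = n / 2.0
    mask = [[0.0] * n for _ in range(n)]
    for row in range(n):
        for col in range(n):
            inside = 0
            for i in range(subsamples):
                for j in range(subsamples):
                    x = col + (i + 0.5) / subsamples - radius
                    y = row + (j + 0.5) / subsamples - radius
                    if x * x + y * y <= radius * radius:
                        inside += 1
            mask[row][col] = inside / subsamples ** 2
    return mask


def normalize(mask):
    """Scale the mask so that its entries sum to 1."""
    total = sum(sum(row) for row in mask)
    return [[value / total for value in row] for row in mask]


raw = aperture_mask(16)
area = sum(sum(row) for row in raw)   # approximates pi * 8**2 square pixels
weights = normalize(raw)
```

Pixels well inside the circle receive a weight fraction of 1, pixels outside receive 0, and only the pixels cut by the boundary take intermediate values, so the summed mask tracks the true circular area far more closely than a binary mask would.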


The  scale  factor  K  accounts  for  SNR  in  the  CCD  array  by  forcing  the  total  intensity  in 
the  image  plane  to  equal  some  average  photon  count  K.  Note  that  the  average  photon 
count, K, is a per subaperture count and, as such, must be divided among the total
number of image planes and scaled by the efficiency of the beam splitting device. The $D_p$
diameter circular aperture is inscribed in an N × N sample aperture grid and the
relationship between the sample dimensions in the aperture plane and the image plane grid is:
[An,  At;]  =  An^^m^  .  Due  to  its  efficiency,  the  Fast  Fourier  Transform  (FFT)  is 
used  to  compute  the  DFT  operation.  The  FFT  requires  that  the  aperture  plane  and  image 
plane  be  composed  of  the  same  size  grid.  For  this  reason,  the  aperture  plane  is  zero  padded 
to create a 2N × 2N grid prior to the FFT operation.
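The zero padding and photon count scaling steps can be sketched in a few lines of Python. This is an illustration only: a slow direct DFT stands in for the FFT, the physical scale factors of the propagation are omitted, and the 4 × 4 square aperture and the function names `dft2` and `expected_image` are assumptions.

```python
import cmath
import math


def dft2(grid):
    """Direct 2-D DFT of a square grid (slow, for illustration only)."""
    size = len(grid)
    twiddle = [[cmath.exp(-2j * math.pi * u * n / size) for n in range(size)]
               for u in range(size)]
    rows = [[sum(row[n] * twiddle[u][n] for n in range(size)) for u in range(size)]
            for row in grid]
    return [[sum(rows[m][u] * twiddle[v][m] for m in range(size)) for u in range(size)]
            for v in range(size)]


def expected_image(aperture, photons):
    """Zero pad an N x N aperture to 2N x 2N, transform, and scale to a photon count."""
    n = len(aperture)
    padded = [[0.0] * (2 * n) for _ in range(2 * n)]
    for r in range(n):
        for c in range(n):
            padded[r][c] = aperture[r][c]
    field = dft2(padded)
    intensity = [[abs(value) ** 2 for value in row] for row in field]
    total = sum(sum(row) for row in intensity)
    return [[photons * value / total for value in row] for row in intensity]


aperture = [[1.0] * 4 for _ in range(4)]   # toy unaberrated square aperture
image = expected_image(aperture, photons=1000.0)
```

Dividing by the summed intensity before multiplying by the photon count makes the result independent of the source amplitude, which is why that amplitude can be treated as arbitrary.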

It is advantageous to position the optical axis at the center of the windowed region
in the image plane. Symmetry in positive and negative tilt effects, for instance, is best
utilized when the projection window is split evenly along the optical axis. For this reason,
the optical axis is centered within the [0, 0] grid location when the image plane window
length is odd. When the window length is even, the optical axis is located at [−0.5, −0.5].
For the case of Nyquist pixel sizing, the half pixel shift in the optical axis is accomplished
by inserting a fixed amount of artificial x and y-tilt in the aperture. Recall the discussion in Section
4.3 and substitute −1/2 pixels for $\Delta n$ in (4.33):

$$a_2 = \frac{-\frac{1}{2} R_p\, \Delta\xi}{2f}. \tag{8.7}$$

Substituting the Nyquist sampled image plane pixel, $\Delta\xi = \frac{\lambda f}{2 D_p}$, yields:

$$a_2 = \frac{-\frac{1}{2} R_p \frac{\lambda f}{2 D_p}}{2f} = \frac{-\lambda}{16}.$$

Converting $a_2$ to units of radians reveals that an input of $-\pi/8$ radians of tilt is required to
shift the image by −0.5 pixels. Examples of the zero padded aperture mask and diffraction
limited point spread functions for even and odd length projection windows are provided in
Figure 8.3.


Figure 8.3  (Left) Zero padded aperture mask $W_Z[\mathbf{n}]$. (Center) Diffraction limited PSF:
entire image plane after performing FFT. (Right) Diffraction limited PSF:
windowed image plane. (Top) Odd $N_W$. (Bottom) Even $N_W$.


The expected image I is combined with noise in the CCD. The details of the CCD
noise model are similar to Cain’s tilt estimator analysis [1]. CCD noise may be categorized
as either signal dependent or signal independent. Signal dependent noise includes all random
light-matter interaction and is modeled as independent Poisson random processes within
each detector pixel. This noise term, commonly referred to as shot noise or photon noise,
becomes the dominant noise contribution under high light conditions. There are many
contributors to signal independent noise. In this simulation, signal independent noise
consists of read out noise and A/D conversion noise. Signal independent noise becomes
the dominant noise contribution under low SNR conditions. The read out noise term is
meant to include effects such as clock noise, camera noise, and amplifier buffer noise. The
A/D conversion noise is the result of a scaling and flooring operation. Assuming that the
voltage step size in the A/D conversion process is equal to the voltage associated with a
single photon detection, the scaling factor is unity and the A/D conversion is modeled as
a simple flooring operation. Finally, the detected signal is biased with the read out noise
variance as suggested by the discrete model in (3.67) and any negative counts are set to
0. Condensing this description into a convenient mathematical form, the detected image
projection may be described by:


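$$\mathbf{v}(d[\mathbf{u}]) = \max\left(0,\ \mathrm{floor}\left[\mathbf{v}(\mathrm{Poisson}\{I[\mathbf{u}]\}) + \mathbf{n}_{ro}\right] + \boldsymbol{\sigma}_{ro}^2\right), \tag{8.8}$$

$$\text{where } \mathbf{n}_{ro} = N_V N_W \text{ length vector of } n_{ro} \text{ noise}, \tag{8.9}$$

$$n_{ro} = \mathcal{N}(0, \sigma_{ro}^2), \tag{8.10}$$

$$\text{and } \boldsymbol{\sigma}_{ro}^2 = N_V N_W \text{ length vector of } \sigma_{ro}^2. \tag{8.11}$$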
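The detection model can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the simulation code: the input is taken to be an already projected vector of expected photon counts, the Poisson draw uses Knuth's multiplication method (adequate only for modest means), the bias is taken to be the read out noise variance as described in the prose, and the names `poisson_sample` and `detect` are hypothetical.

```python
import math
import random


def poisson_sample(mean, rng):
    """Draw one Poisson variate using Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def detect(expected, sigma_ro, rng):
    """Apply shot noise, read out noise, flooring, bias, and clipping to a vector."""
    detected = []
    for mean_photons in expected:
        noisy = poisson_sample(mean_photons, rng) + rng.gauss(0.0, sigma_ro)
        biased = math.floor(noisy) + sigma_ro ** 2   # unity-gain A/D, then bias
        detected.append(max(0.0, biased))            # negative counts set to 0
    return detected


rng = random.Random(2)
expected_projection = [0.5, 4.0, 20.0, 4.0, 0.5]   # toy expected counts per element
detected = detect(expected_projection, sigma_ro=2.0, rng=rng)
```

Because a sum of independent Poisson pixels is itself Poisson, drawing shot noise on the projected vector is statistically equivalent to drawing it per pixel and then projecting.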


The Sensor Algorithm. The sensor algorithm is implemented just as it would be in
an embedded application, with the exception that it runs in the Matlab compiler environment.
In essence, the only simulated portion of the algorithm is its interface with the CCD data.
All input data are generated from the models described in previous sections. Strategies for
evaluating and maximizing the likelihood metric were described in Chapter 5. All parameter
estimates are calculated in series in the computer simulation. However, a real system could
compute the tilt estimates in parallel. Such a system would be more complex, but would
reduce computation time and increase the sensor bandwidth.


8.2  Sensor  Performance 

The performance plots in this section serve to qualify the use of minimum CRLB as a
basis for design variable selection and to compare the performance of the projection curvature
sensor to the projection based ML tilt sensor and the more common two-dimensional
centroid based tilt sensor. I will begin by presenting a series of simulated performance plots
with overlaid CRLB results from Section 6.3, Figures 6.2 through 6.7. Note that, while the
performance plots here should trend the same as the CRLB plots, the simulation results will
differ from CRLB results for two reasons. The first reason is that the estimator algorithm
is not efficient. This means that the variance of the estimator algorithm will approach the
lower bound but will not equal the lower bound. The second reason is that the CRLB
calculations use Zernike based phase screens to model atmospheric phase turbulence, whereas
the simulations represented here use polar sampled Fourier series phase screens. The polar
sampled phase screens contain higher order phase information, which tends to increase the
variance of a low order parameter estimator like the one used in the Z2-4 sensor.


Figure 8.4 contains a plot of simulated residual MSE versus projection separation angle.
The following design variable and environment variable settings were used: Dp = 0.07 m,
σ_ro = 2.13 counts, r0 = 0.05 m, L0 = 10 m, l0 = 0.01 m, K = 1000 photons,
±δa4 = 0.4 radians, and N_W = 14 pixels. The minimum residual error occurs when the
separation angle, θ2 - θ1, equals 90 degrees. Based on this result, the remaining simulation
plots use a projection separation angle of 90 degrees. The CRLB result for whole plane
projection separation angle is included with an independent y-axis. The CRLB plot line is
the dashed line and corresponds to the y-axis on the right hand side of the plot. Notice that
although the y-axis scaling indicates different MSE ranges, the separation angle at which the
simulation minimum occurs equals the separation angle at the CRLB minimum point.

Figure 8.4 Z2-4 estimator residual MSE versus separation angle, θ2 - θ1. r0 = 0.05 m,
L0 = 10 m, l0 = 0.01 m, σ_ro = 2.13 counts, and N_W = 14 pixels.

Figure 8.5 demonstrates the performance versus projection angle θ1. The CRLB continues
to be plotted as a dashed line scaled to the right hand side y-axis for Figures 8.5 through
8.9. This example suggests that performance is effectively invariant to starting projection
angle provided that the separation angle is 90 degrees. A cursory sampling over a
two-dimensional range of start angles and separation angles reveals that the ideal separation
angle is ±90 degrees and that performance is effectively invariant to starting angle under
any fixed separation angle. Due to this result, the projection angle configuration for the
remaining performance examples will be limited to cases where θ1 = 0° and θ2 = 90°.

Figure 8.5 Z2-4 estimator residual MSE versus projection angle θ1 given that
θ2 = θ1 + 90°. r0 = 0.05 m, L0 = 10 m, l0 = 0.01 m, σ_ro = 2.13 counts, and
N_W = 14 pixels.

Figure 8.6 contains a plot of residual wavefront MSE versus defocus diversity for a high
SNR case: Dp = 0.07 m, σ_ro = 2.13 counts, r0 = 0.05 m, L0 = 10 m, l0 = 0.01 m,
K = 1000 photons, and N_W = 7 pixels. The ideal diversity (minimum point) is approximately
the same as the CRLB minimum: the CRLB minimum occurs at about 0.375 radians versus the
performance minimum at about 0.4 radians.

Figure 8.6 Z2-4 estimator residual MSE versus ±δa4 for K = 1000 photons per subaperture.
r0 = 0.05 m, L0 = 10 m, l0 = 0.01 m, σ_ro = 2.13 counts, and N_W = 7 pixels.

Figure 8.7 provides a plot of residual wavefront MSE versus defocus diversity for a low
SNR case. All simulation parameters are the same except the photon count: K = 100. The
result in Figure 8.7 suggests that the ideal choice of diversity is very close to that
suggested by minimizing the performance bound: ideal δa4 ≈ 0.1. Notice that the plot near
the minimum is nearly flat, indicating that minimal MSE benefit is gained by applying
diversity between the image planes. This is an indication that the benefits of the sensor
over tilt only estimation are reduced as SNR decreases. This characteristic will become more
apparent in the curvature sensor versus tilt only sensor comparison plots to follow.

Figure 8.7 Z2-4 estimator residual MSE versus ±δa4 for K = 100 photons per subaperture.
r0 = 0.05 m, L0 = 10 m, l0 = 0.01 m, K = 100, σ_ro = 2.13 counts, and N_W = 7 pixels.

Figures 8.8 and 8.9 contain plots of residual wavefront MSE versus projection window
length for high and low SNR cases respectively. The plots of MSE performance versus window
length, when compared to the CRLB plots, demonstrate an increase in the slope of the
performance curve as the size of the window is decreased below 10 pixels. This result
supports using the CRLB as an initial choice for window length. It is worth noting, however,
that the choice of N_W should not be based solely on minimum MSE. The required system
bandwidth may dictate a shorter window length than that which provides the best performance.
In cases where bandwidth is the limiting factor, performance and CRLB indicate a significant
increase in MSE for N_W < 7 pixels.

Figure 8.8 Z2-4 estimator residual MSE versus N_W for K = 1000 photons per subaperture.
r0 = 0.05 m, L0 = 10 m, l0 = 0.01 m, and σ_ro = 2.13 counts.

Figure 8.9 Z2-4 estimator residual MSE versus N_W for K = 100 photons per subaperture.
r0 = 0.05 m, L0 = 10 m, l0 = 0.01 m, and σ_ro = 2.13 counts.


Figure 8.10 provides a demonstration of sensor performance over a range of SNR
and r0 values with all other operating variables fixed: Dp = 0.07 m, L0 = 10 m, l0 = 0.01 m,
N_W = 7 pixels and σ_ro = 2.13 counts. Figure 8.10 compares sensor performance when using
the Z2,3 estimator (dashed plot lines) to performance when using the Z2-4 estimator (solid
plot lines). These results suggest that, as SNR and Dp/r0 decrease, the ideal configuration
for the estimator requires less defocus diversity, effectively converting the sensor from a
Z2-4 sensor into a Z2,3 sensor. This is consistent with the results from the CRLB analysis.
The CRLB suggested operating regions beyond which it is no longer advantageous to estimate
a4. These threshold values can be derived from the CRLB plot in Figure 6.8. The threshold
values in Figure 8.10 are slightly more conservative than the lower bound threshold values
in Figure 6.8. Of course, this does not preclude the use of the CRLB for determining the
limits on the operating space, but offers evidence that limits set by the CRLB may be
optimistic and should be compared with simulated performance.

Figure 8.10 Simulated residual MSE versus K for several cases of r0. Dashed lines indicate
Z2,3 estimator performance. Solid lines indicate Z2-4 estimator performance.

The final set of performance figures offers a comparison of the curvature sensor
performance with two existing techniques: the common two-dimensional centroid based
estimator and a projection based ML tilt estimator. Figures 8.11 and 8.12 overlay the
centroid performance against the results from Figure 8.10. The centroid sensor performance
varies over a range of image plane window sizes. To demonstrate how centroid performance
changes with increasing window size, colored plot lines {red, green, blue, cyan} correspond
to performance over the set N_W = {5, 6, 7, 8} respectively. Figure 8.13 overlays
the projection based ML tilt estimator performance against the Z2-4 sensor results from
Figure 8.10. Here the ML tilt sensor operates with minimal prior distribution knowledge:
the  maximization  algorithm  uses  a  range  of  ±5ci2,3  in  the  search  for  the  likelihood  maxi¬ 
mum.  The  ML  tilt  uses  simulated  long  dwell  images  in  its  expected  image  lookup  table. 


Figure 8.11 Comparison of simulated centroiding tilt estimator performance to the Z2,3
and Z2-4 MAP estimator over a range of r0 and K values.

Figure 8.12 Comparison of simulated centroiding tilt estimator performance to the Z2,3
and Z2-4 MAP estimator over a range of r0 and K values.

Figure 8.13 Comparison of simulated projection based ML tilt estimator performance to
the Z2,3 and Z2-4 MAP estimator over a range of r0 and K values.

Seeing the successful performance of the Z2-4 sensor raises the question: why stop
estimating parameters at defocus? The simple answer is that estimating parameters higher
than a4 increases the residual MSE of the estimator beyond the performance set by the
Z2-4 estimator over the majority of the simulated operating space. The performance plot
in Figure 8.14 demonstrates this phenomenon. The solid plot lines depict the residual MSE
for the Z2-4 sensor seen in previous figures. The dashed plot lines indicate the residual
MSE for a whole plane projection sensor attempting to estimate parameters a2 through a6.
The design variables and environment variables used in the simulation are Dp = 0.07 m,
L0 = 10 m, l0 = 0.01 m, N_W = 7 pixels and σ_ro = 2.13 counts. Notice that the whole plane
Z2-6 sensor either performs on par with or fails to perform better than the Z2-4 sensor
throughout the operational space. This example suggests that, in order to estimate more
parameters, the sensor will require additional information from the CCD.

Figure 8.14 Solid plot line depicts Z2-4 estimator performance. Dashed plot line indicates
Z2-6 performance.


8.3 Sensitivity Analysis

This section contains performance results from a series of simulations in which the
sensor was purposefully given erroneous estimates of the environment variables: r0, L0, and
l0. The intent is to provide a demonstration of the sensor's robustness under the influence
of inaccurate environment variable estimates. Several design variables remain constant
throughout each of the figures to follow: Dp = 0.07 m, N_W = 7 pixels and σ_ro = 2.13
counts. Also consistent throughout is the choice of plot line styles and their corresponding
data series. Solid lines indicate Z2-4 sensor performance, while dashed lines indicate ML
tilt sensor performance. The ML performance lines are included as comparison lines to
highlight points where poor estimates of the environment variables negate the Z2-4 sensor's
performance advantage. The true value of the environment variable for each plot line is
indicated by the x symbol along the solid line and a O symbol along the dashed line. Each
point along the solid performance lines represents an average value from 30 random cases.
A confidence interval at each plot point is indicated by a pair of triangles: one pointing
upward for +1σ and one pointing downward for -1σ.
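As an illustration of how each plotted point and its ±1σ markers could be formed from the 30 random cases, a minimal sketch follows. It assumes the markers are the sample standard deviation of the per-case residual MSE; the text does not state which spread statistic was used, and `summarize_cases` is a hypothetical name.

```python
import statistics


def summarize_cases(residual_mse_cases):
    """Return (mean - 1 sigma, mean, mean + 1 sigma) over a set of random cases."""
    mean = statistics.mean(residual_mse_cases)
    sigma = statistics.stdev(residual_mse_cases)  # assumed: sample standard deviation
    return mean - sigma, mean, mean + sigma
```

The lower and upper values correspond to the downward and upward triangles at a plot point, and the middle value to the point on the solid line.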


Figure 8.15 demonstrates r0 sensitivity in high SNR, K = 1000 photons, at three
locations in the operating space. The true values of the parameter, r0 = {0.04 m, 0.05 m,
0.1 m}, correspond to the three pairs of plot lines. As indicated by the x-axis labeling, the
estimates of r0 range from 0.02 m to 0.2 m. However, the entire range is not tested for each
r0 case. The range of test values is selected based on the true value of r0. Figure 8.16
provides the same r0 sensitivity analysis at low SNR: K = 200 photons. Figure 8.17 shows
L0 sensitivity in high SNR: K = 1000 photons. The true L0 value is set at 10 m while the
estimate of L0 is in the set {1 m, 10 m, 100 m}. Just as in the r0 analysis figures, the solid
line indicates Z2-4 performance while the dashed line indicates ML tilt performance. The
true value of L0 for each performance line is indicated by the x symbol along the solid line
and a O along the dashed line. Figure 8.18 provides the same L0 sensitivity analysis at low
SNR: K = 200 photons. Figure 8.19 demonstrates l0 sensitivity in high SNR: K = 1000
photons. The true l0 value is set at 0.01 m while the estimate of l0 is in the set {0 m,
0.01 m, 0.1 m}. Figure 8.20 provides the same l0 sensitivity analysis at low SNR: K = 200
photons.

Figure 8.15 Solid lines indicate residual MSE versus r0 estimate. Dashed lines represent
the Z2,3 ML estimator performance threshold. The true value of r0 is indicated by an x
(solid line) or circle (dashed line). Triangles indicate ±1σ. K = 1000.

Figure 8.16 Solid lines indicate residual MSE versus r0 estimate. Dashed lines represent
the Z2,3 ML estimator performance threshold. The true value of r0 is indicated by an x
(solid line) or circle (dashed line). Triangles indicate ±1σ. K = 200.

The results of the sensitivity analysis demonstrate that the sensor performance depends
on the r0 estimate more so than the estimates of L0 and l0. In fact, the sensor's
performance is nearly invariant to L0 and l0 estimates. The simulated cases indicate that
when L0 and l0 are unknown, the best course of action is to overestimate L0 (set L0 > 100 m)
and set the l0 estimate equal to zero. The sensitivity to r0 is most evident in low SNR and
at high Dp/r0. It is worthwhile to note that, although its performance is degraded, the Z2-4
sensor performs on par with or better than the ML tilt sensor over the range of simulated
r0 estimates.

8.4  Summary 

Figure 8.17 Solid lines indicate residual MSE versus L0 estimate. Dashed lines represent
the Z2,3 ML estimator performance threshold. The true value of L0 is indicated by an x
(solid line) or circle (dashed line). Triangles indicate ±1σ. K = 1000.

Figure 8.18 Solid lines indicate residual MSE versus L0 estimate. Dashed lines represent
the Z2,3 ML estimator performance threshold. The true value of L0 is indicated by an x
(solid line) or circle (dashed line). Triangles indicate ±1σ. K = 200.

Figure 8.19 Solid lines indicate residual MSE versus l0 estimate. Dashed lines represent
the Z2,3 ML estimator performance threshold. The true value of l0 is indicated by an x
(solid line) or circle (dashed line). Triangles indicate ±1σ. K = 1000.

Figure 8.20 Solid lines indicate residual MSE versus l0 estimate. Dashed lines represent
the Z2,3 ML estimator performance threshold. The true value of l0 is indicated by an x
(solid line) or circle (dashed line). Triangles indicate ±1σ. K = 200.

This chapter began with a brief description of the Z2-4 sensor simulation. Simulated
performance plots were provided in order to demonstrate the search for ideal design variable
settings and to provide an estimate of sensor performance over a typical range of the
operating space. Simulated performance examples used to predict ideal design variable
settings show that the computer simulation and the calculated CRLB results complement
each other. The fact that the CRLB results and simulated results lead to similar conclusions
increases confidence in the simulation accuracy and suggests that the CRLB is an adequate
tool for making preliminary design choices. A comparison of the Z2-4 sensor performance to
the centroid and tilt-only ML estimators revealed that, under the simulated conditions, the
Z2-4 sensor performance is on par with or better than the other sensor designs for cases
where the average photon count is greater than 100 photons per subaperture and the ratio
Dp/r0 is greater than 0.5.

9. The Z2-10 Wavefront Sensor

This chapter provides the details and simulated performance results for a curvature sensor
designed to estimate Zernike polynomial coefficients a2 through a10. To aid with coefficients
higher than a4, half plane image projections are used. This sensor provides estimates in two
stages, as did the Z2-4 sensor. The tilt estimates are formed in the first stage and all higher
order estimates are formed in the second stage, making the Z2-10 sensor highly parallelizable.
If all higher order estimates are computed in parallel, the only additional complexity over
the Z2-4 sensor comes from managing half plane image projections, which increases the
computational complexity by about 67%. The sections to follow outline the major differences
between the Z2-4 sensor and the Z2-10 sensor and provide an analysis of Z2-10 sensor
simulated performance.


9.1  Image  Projections 

The Z2-10 sensor design uses half plane image projections. The half plane image
projection is defined as a concatenated pair of projections. The windowed region in the
image plane is divided between the two projections. The windowed region is divided evenly
if the window length is even. When the window length is odd, the odd center row of pixels is
summed into the second half image projection. Figure 9.1 provides a diagram of
the Z2-10 image projections. The projection operations included in the Z2-10 estimator
expressions are:


$${}^{\{(1,N_1),N_W\}}V\left(D_{i,\theta_i}\right), \tag{9.1}$$

$${}^{\{(N_2,N_W)\}}V\left(D_{i,\theta_i}\right), \tag{9.2}$$

$${}^{\{(1,N_1),N_W\}}V\left(D_{1,0}, D_{2,90}\right), \tag{9.3}$$

$${}^{\{(N_2,N_W)\}}V\left(D_{1,0}, D_{2,90}\right), \text{ and} \tag{9.4}$$

$${}^{\{(1,N_1),(N_2,N_W)\}}V\left(D_{1,0}, D_{2,90}\right), \tag{9.5}$$

$$\text{where } N_1 = \operatorname{floor}(N_W/2), \tag{9.6}$$

$$\text{and } N_2 = \operatorname{floor}(N_W/2) + 1. \tag{9.7}$$

Figure 9.1 Diagram of the Z2-10 sensor's half plane image projection operation for 6 × 6
pixel and 5 × 5 pixel windows.


Note that each of the half plane projections is read from the CCDs only once. The projection
operations are expressed as unique operations here solely for mathematical convenience.
The tilt estimator and the Z4 estimator still require whole plane projections. The whole
plane projections are constructed from the sum of half plane projections.
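The half plane split and the reconstruction of a whole plane projection can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the window is assumed to be a square list of rows, and the projection is assumed to sum pixel rows within each half so that each half projection has N_W elements. The row split follows N1 = floor(N_W/2) and N2 = N1 + 1 from (9.6) and (9.7).

```python
def half_plane_projections(window):
    """Return (first_half, second_half) projections of a square window.

    Rows 1..N1 feed the first half and rows N2..N_W feed the second half
    (1-based), so an odd center row is summed into the second half.
    """
    n_w = len(window)
    n1 = n_w // 2  # N1 = floor(N_W / 2)
    first = [sum(window[r][c] for r in range(0, n1)) for c in range(n_w)]
    second = [sum(window[r][c] for r in range(n1, n_w)) for c in range(n_w)]
    return first, second


def whole_plane_projection(first, second):
    """Whole plane projection formed as the sum of the two half plane projections."""
    return [a + b for a, b in zip(first, second)]
```

For a 5 × 5 window of ones, the first half sums two rows and the second half sums three, matching the odd case described above; for a 6 × 6 window both halves sum three rows.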


Due to the half plane projection requirement, the Z2-10 sensor must process more
data than the Z2-4 sensor. The chart in Figure 9.2 describes the relative difference in
computational complexity of estimating a2 through a10 based on the difference in projection
data and the level of parallel or serial computation. Thus, any design using half plane
image projections accepts the engineering trade-off of increased information in exchange
for increased read noise, increased CCD read time, and increased computation time. The
increased information provides the ability to estimate Zernike polynomials beyond Z4.

Figure 9.2 Relative computational complexity between serial and parallel estimator
configurations.


9.2  Likelihood  Expressions 

The sensor hardware provides four half plane image projections from each subaperture.
Figure 9.3 diagrams the read out and flow of the four image projections through
the estimator algorithm. Each likelihood expression requires four inputs: a detected image
projection, a reference image projection, a set of atmospheric parameter estimates and an
estimate of the current photon level, K. Solid lines indicate the flow of real time detected
image projections. Dashed lines indicate information used for reference image projections,
which are computed and stored into lookup tables during sensor calibration. The heavily
outlined blocks in Figure 9.3 indicate locations where a likelihood expression is evaluated.
Each parameter is estimated independently. The tilt parameters must be estimated first.

Figure 9.3 Diagram shows the flow of image projections through the Z2-10 estimation
algorithm.


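The tilt likelihood expressions used in the Z2-10 estimator are given by:

$$L_{map_2}(A_2) = \sum_l \Big( \left[ {}^{\{(1,N_1),N_W\}}V_l(D_{1,0}) + {}^{\{(N_2,N_W)\}}V_l(D_{1,0}) + \sigma_{ro}^2 \right] \times \ln\left\{ {}^{\{(1,N_W)\}}V_l\left({}_{L_{23}}I_{1,0}[A_2]\right) + \sigma_{ro}^2 \right\} - {}^{\{(1,N_W)\}}V_l\left({}_{L_{23}}I_{1,0}[A_2]\right) \Big) - \frac{A_2^2}{2\sigma_{a_2}^2} \tag{9.8}$$

$$L_{map_3}(A_3) = \sum_l \Big( \left[ {}^{\{(1,N_1),N_W\}}V_l(D_{2,90}) + {}^{\{(N_2,N_W)\}}V_l(D_{2,90}) + \sigma_{ro}^2 \right] \times \ln\left\{ {}^{\{(1,N_W)\}}V_l\left({}_{L_{23}}I_{2,90}[A_3]\right) + \sigma_{ro}^2 \right\} - {}^{\{(1,N_W)\}}V_l\left({}_{L_{23}}I_{2,90}[A_3]\right) \Big) - \frac{A_3^2}{2\sigma_{a_3}^2} \tag{9.9}$$

Recall that each parameter estimate is formed by maximizing the likelihood with respect
to the parameter:

$$\max_{A_x} L_{map}(A_x) \Big|_{A_x = \hat{A}_x} \tag{9.10}$$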

Unlike higher order parameters, the tilt estimates are computed from whole plane projections.
Therefore, the two half plane projections are summed to create a single whole plane
projection prior to evaluation of the likelihood expression. Once tilt estimates are
available, all higher order parameters may be estimated using the tilt estimates as indices
into the tables of preregistered image projections. The estimator selects the preregistered
Z4 through Z10 projections with the closest matching pair of tilt values:


AA2,  round 


(9.11) 
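One way to select the closest preregistered tilt value is to round the estimate onto the table grid. The sketch below assumes a uniformly spaced table described by hypothetical parameters `table_min`, `table_step`, and `table_size`; the actual table layout is not specified here.

```python
def nearest_tilt_index(tilt_estimate, table_min, table_step, table_size):
    """Index of the table entry whose tilt value is closest to the estimate."""
    index = round((tilt_estimate - table_min) / table_step)
    # Clamp so estimates outside the table map to the nearest edge entry.
    return min(max(index, 0), table_size - 1)
```

The same function would be applied once per tilt axis, giving the pair of indices used to fetch the preregistered higher order projections.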


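The likelihood for Z4 is computed from whole plane projections much like the tilt likelihood:

$$L_{map_4}(A_4) = \sum_l \Big( \left[ {}^{\{(1,N_1),N_W\}}V_l(D_{1,0}, D_{2,90}) + {}^{\{(N_2,N_W)\}}V_l(D_{1,0}, D_{2,90}) + \sigma_{ro}^2 \right] \times \ln\left\{ {}^{\{(1,N_W)\}}V_l\left({}_{L_4}I_{1,0}[A_2, A_3, A_4],\ {}_{L_4}I_{2,90}[A_2, A_3, A_4]\right) + \sigma_{ro}^2 \right\} - {}^{\{(1,N_W)\}}V_l\left({}_{L_4}I_{1,0}[A_2, A_3, A_4],\ {}_{L_4}I_{2,90}[A_2, A_3, A_4]\right) \Big) - \frac{A_4^2}{2\sigma_{a_4}^2} \tag{9.12}$$

The likelihood for all higher order estimates is given by:

$$L_{map_x}(A_x) = \sum_l \Big( \left[ {}^{\{(1,N_1),(N_2,N_W)\}}V_l(D_{1,0}, D_{2,90}) + \sigma_{ro}^2 \right] \times \ln\left\{ {}^{\{(1,N_1),(N_2,N_W)\}}V_l\left({}_{L_x}I_{1,0}[A_2, A_3, A_x],\ {}_{L_x}I_{2,90}[A_2, A_3, A_x]\right) + \sigma_{ro}^2 \right\} - {}^{\{(1,N_1),(N_2,N_W)\}}V_l\left({}_{L_x}I_{1,0}[A_2, A_3, A_x],\ {}_{L_x}I_{2,90}[A_2, A_3, A_x]\right) \Big) - \frac{A_x^2}{2\sigma_{a_x}^2} \tag{9.13}$$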

9.3  Sensor  Performance 

This section contains several examples of Z2-10 sensor simulated performance. The
simulated performance will be used for design variable selection and it will be compared
with the projection based ML tilt sensor and the more common two-dimensional centroid
based tilt sensor. To compare the results of using CRLB versus simulation performance as
a metric for design variable selection, I will begin by overlaying the series of plots found
in Section 6.3 onto simulated sensor performance. Just as with the Z2-4 sensor simulation,
the performance plots here should trend the same as the CRLB plots but will differ from
CRLB results in overall magnitude. The discrepancy between CRLB calculations and
the simulation is attributable to variance in the estimator and the difference between the
random phase generation method used in the CRLB calculations versus the method used
in simulation.


Figure 9.4 contains a plot of residual MSE versus projection separation angle. The
following design variable and environment variable settings were used: Dp = 0.07 m,
σ_ro = 2.13 counts, r0 = 0.05 m, L0 = 10 m, l0 = 0.01 m, K = 1000 photons, ±δa4 = 0.55
radians, and N_W = 14 pixels. The CRLB result for half plane projection separation angle is
included with an independent y-axis. The CRLB plot line is the dashed line and corresponds
to the y-axis on the right hand side of the plot. The simulated performance results in
Figure 9.4 agree with the CRLB results, indicating that the ideal choice of separation angle
is 90 degrees. Using this result, Figure 9.5 demonstrates simulated performance versus θ1
for a separation angle of 90 degrees. Figure 9.5 echoes the results seen in the CRLB plots
for both the Z2-4 and Z2-10 sensors (Figures 6.3 and 6.10) and in the performance results
for the Z2-4 sensor (Figure 8.5): residual MSE is effectively invariant over the range of θ1
given that the difference between projection angles is 90 degrees. Note that the CRLB
continues to be represented by a dashed plot line scaled to the right hand side y-axis for
Figures 9.5 through 9.9.

Figure 9.4 Z2-10 estimator residual MSE versus separation angle, θ2 - θ1. r0 = 0.05 m,
L0 = 10 m, l0 = 0.01 m, and N_W = 14 pixels.

Figure 9.5 Z2-10 estimator residual MSE versus projection angle θ1 given that
θ2 = θ1 + 90°. r0 = 0.05 m, L0 = 10 m, l0 = 0.01 m, and N_W = 14 pixels.

Figure 9.6 contains a plot of residual wavefront MSE versus defocus diversity for a high
SNR case: Dp = 0.07 m, σ_ro = 2.13 counts, r0 = 0.05 m, L0 = 10 m, l0 = 0.01 m,
K = 1000 photons, and N_W = 9 pixels. Performance in Figure 9.6 suggests that the ideal
diversity is approximately 0.55 radians, which is the same as the CRLB result. Figure 9.7
provides a plot of residual wavefront MSE versus defocus diversity for a low SNR case. The
simulation inputs are the same as those used in the previous example except the photon
count: K = 100. The result in Figure 9.7 is again close to the CRLB result: performance
suggests that the ideal choice of diversity at low SNR is approximately 0.3 radians, while
the CRLB suggests that the ideal diversity is about 0.4 radians.

Figure 9.6 Z2-10 estimator residual MSE versus ±δa4 for K = 1000 photons per subaperture.
r0 = 0.05 m, L0 = 10 m, l0 = 0.01 m, and N_W = 9 pixels.

Figure 9.7 Z2-10 estimator residual MSE versus ±δa4 for K = 100 photons per subaperture.
r0 = 0.05 m, L0 = 10 m, l0 = 0.01 m, K = 100, and N_W = 9 pixels.

Figures 9.8 and 9.9 contain plots of residual wavefront MSE versus projection window size
for high and low SNR cases respectively. The plots of simulated performance versus window
size indicate that sensor performance is best when the window size is about 12 × 12 pixels.
Required system bandwidth may dictate a shorter window length than that which provides ideal
performance. In cases where system bandwidth requires reading out fewer pixels, it is
important to note that the CRLB and simulated performance plots indicate a significant
reduction in performance for projection vector lengths shorter than 8 pixels.

Figure 9.8 Z2-10 estimator residual MSE versus N_W for K = 1000 photons per subaperture.


Figure 9.9 Z2-10 estimator residual MSE versus N_W for K = 100 photons per subaperture.

Figures 9.10 and 9.11 provide a demonstration of the Z2-10 sensor performance over
a range of SNR and r0 values with all other operating variables fixed: Dp = 0.07 m,
L0 = 10 m, l0 = 0.01 m, N_W = 9 pixels and σ_ro = 2.13 counts. The solid plot lines
represent Z2-10 performance. The dashed plot lines represent tilt only performance. The tilt
only performance lines are the same dashed plot lines found in the whole plane projection
performance plot in Figure 8.10. The tilt only performance lines are included in this figure
in order to demonstrate the lower bounds in the operating space on SNR and the ratio Dp/r0
beyond which it is no longer advantageous to implement the higher order estimator. Recall
also that the tilt only performance shown here offers the additional advantage of
using a 7 pixel window versus N_W = 9 pixels. When compared with the performance of the
Z2,3 sensor, it is easy to see the advantages and disadvantages of estimating higher order
parameters. The advantage of the Z2-10 sensor increases as SNR increases. The Z2-10
sensor does not provide significant improvement over the tilt only whole plane projection
sensor when K < 200 photons per subaperture.

Figure 9.10 Simulated residual MSE versus K for several cases of r0. Dashed lines indicate
Z2,3 sensor (whole plane projections) performance. Solid lines indicate Z2-10 sensor
(half plane projections) performance.

Figure 9.11 Simulated residual MSE versus K for several cases of r0. Dashed lines indicate
Z2,3 sensor (whole plane projections) performance. Solid lines indicate Z2-10 sensor
(half plane projections) performance.


The next set of performance figures offers a comparison of the Z2-10 curvature sensor
performance with the projection based ML tilt estimator and the two-dimensional centroid
based estimator. Figures 9.12 and 9.13 overlay the centroid performance. There is no direct
comparison of window length in the centroiding case with a projection length because the
centroid estimator uses two-dimensional data. Additionally, the ideal centroiding window
size changes over the operational space. For these reasons, I have included several centroid
performance plot lines for each case of r_0 to demonstrate centroid performance at different
window sizes. The set of colors {red, green, blue, cyan} corresponds to the window sizes
in the set {5, 6, 7, 8} respectively. Figures 9.14 and 9.15 overlay the ML tilt estimator
performance. The ML performance lines shown here are the same lines from Figure .
Figures 9.12 through 9.15 reveal that Z2-10 sensor performance is on par with or better than
both tilt estimators throughout the entire operational range.


Figure 9.12 Comparison of simulated centroiding tilt estimator performance to the Z2-10
estimator over a range of r_0 and K values.


9.4 Sensitivity Analysis

This section contains performance results from a series of simulations in which the
sensor was purposefully given erroneous estimates of the environment variables: r_0, L_0, and
l_0. The intent is to provide a demonstration of the sensor’s robustness under the influence
of inaccurate environment variable estimates. Several design variables remain constant
throughout each of the figures to follow: D_p = 0.07 m, N_W = 9 pixels and σ_ro = 2.13 counts.
Also consistent throughout is the choice of plot line styles and their corresponding data
series. Solid lines indicate Z2-10 sensor performance, while dashed lines indicate ML tilt
sensor performance. The ML performance lines are included as comparison lines to highlight
points where poor estimates of the environment variables negate the Z2-10 sensor’s
performance advantage. The true value of the environment variable for each plot line is
indicated by the x symbol along the solid line and a circle symbol along the dashed line.
Each point along the solid performance lines represents an average value from 30 random
cases. A confidence interval at each plot point is indicated by a pair of triangles: one
pointing upward for +1σ and one pointing downward for -1σ, where σ represents the 30 sample
standard deviation.

Figure 9.13 Comparison of simulated centroiding tilt estimator performance to the Z2-10
estimator over a range of r_0 and K values.

Figure 9.14 Comparison of simulated projection based ML tilt estimator performance to
the Z2-10 estimator over a range of r_0 and K values.

Figure 9.15 Comparison of simulated projection based ML tilt estimator performance to
the Z2-10 estimator over a range of r_0 and K values.
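
The plot points and ±1σ markers used in the sensitivity figures can be computed from the
per-trial results as in the following Python sketch. The function name summarize_trials
and the use of the unbiased (ddof=1) sample standard deviation are assumptions made for
illustration; the text does not state which normalization was used.

```python
import numpy as np

def summarize_trials(residual_mse):
    """Return (mean, mean - sigma, mean + sigma) for one plot point.

    residual_mse: per-trial residual MSE values (30 random cases per point
    in the text). ddof=1 is an assumption, not taken from the source.
    """
    trials = np.asarray(residual_mse, dtype=float)
    mean = trials.mean()
    sigma = trials.std(ddof=1)  # sample standard deviation
    return mean, mean - sigma, mean + sigma

# Small deterministic example: three hypothetical trial values.
mean, lower, upper = summarize_trials([1.0, 2.0, 3.0])
```

The lower and upper values correspond to the downward and upward triangles respectively.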


Figure 9.16 demonstrates r_0 sensitivity in high SNR, K = 1000 photons, at three
locations in the operating space. The true values of the parameter r_0 = {0.04 m, 0.05 m,
0.1 m} correspond to the three pairs of plot lines. As indicated by the x-axis labeling, the
estimates of r_0 range from 0.02 m to 0.2 m. However, the entire range is not tested for each
r_0 case. The range of test values is selected based on the true value of r_0. Figure 9.17
provides the same r_0 sensitivity analysis at low SNR: K = 200 photons. Figure 9.18 shows
L_0 sensitivity in high SNR: K = 1000 photons. The true L_0 value is set at 10 m while
the estimate of L_0 is in the set {1 m, 10 m, 100 m}. The three pairs of lines correspond
to the previous r_0 cases: 0.04 m, 0.05 m, and 0.1 m respectively. Just as in the r_0 analysis
figures, the solid line indicates Z2-10 performance while the dashed line indicates ML tilt
performance. The true value of L_0 for each performance line is indicated by the x symbol
along the solid line and a circle along the dashed line. Figure 9.19 provides the same L_0
sensitivity analysis at low SNR: K = 200 photons. Figure 9.20 demonstrates l_0 sensitivity
in high SNR: K = 1000 photons. The true l_0 value is set at 0.01 m while the estimate of l_0
is in the set {0 m, 0.01 m, 0.1 m}. Once again the three pairs of plot lines correspond to the
three r_0 cases. Figure 9.21 provides the same l_0 sensitivity analysis at low SNR: K = 200
photons.

Figure 9.16 Solid lines indicate Z2-10 residual MSE versus r_0 estimate. Dashed lines
represent the Z2,3 ML estimator performance threshold. The true value of
r_0 is indicated by an x (solid line) or circle (dashed line). Triangles indicate
±1σ. K = 1000.

Figure 9.17 Solid lines indicate Z2-10 residual MSE versus r_0 estimate. Dashed lines
represent the Z2,3 ML estimator performance threshold. The true value of
r_0 is indicated by an x (solid line) or circle (dashed line). Triangles indicate
±1σ. K = 200.

Figure 9.18 Solid lines indicate residual MSE versus L_0 estimate. Dashed lines represent
the Z2,3 ML estimator performance threshold. The true value of L_0 is
indicated by an x (solid line) or circle (dashed line). Triangles indicate
±1σ. K = 1000.

Figure 9.19 Solid lines indicate Z2-10 residual MSE versus L_0 estimate. Dashed lines
represent the Z2,3 ML estimator performance threshold. The true value of
L_0 is indicated by an x (solid line) or circle (dashed line). Triangles indicate
±1σ. K = 200.


The results of the sensitivity analysis demonstrate that the sensor performance depends
on the r_0 estimate more than on the estimates of L_0 and l_0. In fact, the sensor’s
performance is nearly invariant to the L_0 and l_0 estimates. The simulated cases indicate
that when L_0 and l_0 are unknown, the best course of action is to overestimate L_0 (set
L_0 > 100 m) and set the l_0 estimate equal to zero. The sensitivity to r_0 is most evident
in low SNR and
at high

Figure 9.20 Solid lines indicate Z2-10 residual MSE versus l_0 estimate. Dashed lines
represent the Z2,3 ML estimator performance threshold. The true value of
l_0 is indicated by an x (solid line) or circle (dashed line). Triangles indicate
±1σ. K = 1000.

Figure 9.21 Solid lines indicate Z2-10 residual MSE versus l_0 estimate. Dashed lines
represent the Z2,3 ML estimator performance threshold. The true value of
l_0 is indicated by an x (solid line) or circle (dashed line). Triangles indicate
±1σ. K = 200.
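
The weak dependence on the outer and inner scale estimates is at least consistent with the
shape of the von Karman spectrum at spatial frequencies near the subaperture scale. The
Python sketch below evaluates a modified von Karman phase spectrum at κ = 2π/D_p. The
specific form of the spectrum (constant 0.023, inner scale cutoff κ_m = 5.92/l_0), the
function name, and the choice of evaluation frequency are assumptions made for
illustration and are not taken from the simulations reported in this chapter.

```python
import numpy as np

def modified_von_karman_psd(kappa, r0, L0, l0):
    """Assumed modified von Karman phase PSD, kappa in rad/m.

    Phi(kappa) = 0.023 r0^(-5/3) exp(-(kappa/kappa_m)^2) / (kappa^2 + kappa_0^2)^(11/6)
    with kappa_0 = 2*pi/L0 and kappa_m = 5.92/l0. l0 = 0 disables the cutoff.
    """
    kappa0 = 2.0 * np.pi / L0
    inner = 1.0 if l0 == 0 else np.exp(-(kappa * l0 / 5.92) ** 2)
    return 0.023 * r0 ** (-5.0 / 3.0) * inner / (kappa**2 + kappa0**2) ** (11.0 / 6.0)

D_p = 0.07                    # subaperture diameter from the text [m]
kappa = 2.0 * np.pi / D_p     # assumed representative spatial frequency [rad/m]

# Effect of overestimating L0 (true value 10 m, estimate 100 m).
ratio_L0 = (modified_von_karman_psd(kappa, 0.05, 10.0, 0.01)
            / modified_von_karman_psd(kappa, 0.05, 100.0, 0.01))

# Effect of setting the l0 estimate to zero (true value 0.01 m).
ratio_l0 = (modified_von_karman_psd(kappa, 0.05, 10.0, 0.01)
            / modified_von_karman_psd(kappa, 0.05, 10.0, 0.0))
```

Under these assumptions both ratios stay within a few percent of one, whereas r_0 scales
the whole spectrum through the r_0^(-5/3) factor.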

9.5 Summary

This chapter began with a brief description of the Z2-10 sensor and the half plane
image projection. Simulated performance plots were provided in order to demonstrate the
search for ideal design variable settings, and to provide an estimate of sensor performance
over a typical range of the operating space. Performance was compared to the CRLB
results. As was the case with the Z2-4 sensor, simulated Z2-10 sensor performance and
the calculated CRLB results complement each other. Once again, the similarity between
simulation and CRLB results increases confidence in the simulation accuracy and suggests
that the CRLB is an adequate tool for making preliminary design choices for the Z2-10 sensor.
A comparison of the Z2-10 sensor performance to the centroid and tilt only ML estimators
revealed that, under the simulated conditions, the Z2-10 sensor performance is on par with
or better than the other sensor designs for cases where the SNR is greater than 200 photons
per subaperture.

10. Conclusion


The  temperature  and  pressure  throughout  the  atmosphere  change  constantly.  These  factors 
contribute  significantly  to  the  index  of  refraction  which  affects  the  way  that  light  propagates 
through  the  atmosphere.  In  optical  imaging  systems,  the  atmosphere’s  turbulent  nature 
has  the  effect  of  introducing  an  extended  lens  with  randomly  fluctuating  aberrations  into  the 
optical  path.  Unfortunately,  these  random  aberrations  often  represent  the  limiting  factor 
in  image  resolution  for  a  given  optical  system.  Wherever  applications  require  imaging 
through  the  atmosphere,  adaptive  optics  (AO)  systems  offer  the  ability  to  improve  the 
quality  of  imaging.  Adaptive  optics  improve  image  quality  by  sensing  and  compensating 
for  the  random  phase  fluctuations  injected  by  the  atmosphere.  The  device  responsible 
for  detecting  atmospheric  phase  aberrations  is  called  a  wavefront  sensor.  The  focus  of 
this  research  is  the  design  and  simulation  testing  of  two  new  options  for  wavefront  sensing. 
The  AO  wavefront  sensors  simulated  in  this  research  estimate  lower  order  phase  modes  in 
the  aperture  from  intensity  measurements  in  the  image  plane.  The  difficulties  involved  in 
detecting  wavefront  phase  from  image  intensity  include:  distinguishing  lower  order  modes 
amid  higher  order  modal  effects  and  maintaining  performance  under  low  SNR  and  the 
influence  of  CCD  read  noise.  This  research  demonstrates  two  projection  based  sensor 
designs  which  offer  improvements  in  each  of  these  categories  over  existing  sensor  designs. 

10.1  Research  Contributions 

The primary research contributions are the development and simulation testing of
two projection based wavefront curvature sensors for adaptive optics. The Z2-4 and Z2-10
wavefront sensors proposed here would replace the physical devices used in traditional
interferometric or Hartmann style sensors with a beamsplitter, a pair of projection
based CCDs and a computer processor. These wavefront sensors are designed to detect
several low order harmonics which comprise a large percentage of the atmospheric
aberrations. By improving tilt estimates and adding the ability to detect curvature modes,
the proposed wavefront sensors are able to significantly reduce wavefront residual mean
squared error and expand the operating space of existing sensor designs.

Due to the rate at which the atmospheric turbulence evolves, adaptive optics systems
must operate using bandwidths greater than several hundred hertz. A significant portion of
the AO loop is dedicated to the control electronics and manipulating the deformable mirror,
leaving only a few milliseconds to solve the wavefront sensing problem. CCD read out can
represent a large portion of the sensor processing time when processing periods are on the
order of a millisecond. To save time with CCD read out, previous research has proposed
compressing two-dimensional images into vectors of pixels called image projections. The
vector based wavefront sensor must detect wavefront phase from the compressed image
information. This dissertation demonstrates that curvature modes up to the 10th Zernike
polynomial can be estimated from image projections.
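
As a rough illustration of the compression step, the Python sketch below forms whole plane
projections of a small image by summing along each axis and counts the values that must be
read out in each case. The function names and the example array are hypothetical, and the
sketch models only the arithmetic of the projection, not on-chip summation or the half
plane projections used by the Z2-10 sensor.

```python
import numpy as np

def whole_plane_projections(image):
    """Collapse a 2-D image into two 1-D projection vectors."""
    image = np.asarray(image, dtype=float)
    col_proj = image.sum(axis=0)  # one value per column (profile along x)
    row_proj = image.sum(axis=1)  # one value per row (profile along y)
    return col_proj, row_proj

def readout_counts(n):
    """Values read out for an n x n window: full frame versus two projections."""
    return {"two_dimensional": n * n, "projections": 2 * n}

image = np.arange(12).reshape(3, 4)        # hypothetical 3 x 4 intensity pattern
col_proj, row_proj = whole_plane_projections(image)
counts = readout_counts(9)                 # N_W = 9 pixels, as in Chapter 9
```

For a 9 x 9 window this is 18 values instead of 81, which is where the saving in read out
time and in the number of read noise contributions comes from.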

This research presents a unique modification to existing image projection techniques.
I have developed two MAP estimators: the first uses whole plane image projections
and the second uses half plane image projections. The MAP estimator requires the
wavefront sensor to evaluate and minimize a Bayesian risk function. This process can be
reduced to maximizing a likelihood function over multiple parameters. Maximizing the
likelihood metric over the multidimensional parameter space inherent to the curvature
estimation problem can present a challenging task. I have shown here that sufficient
information exists in the image projections to estimate several lower order parameters
independently. By estimating parameters independently, the multidimensional problem is
reduced to several one-dimensional maximization problems. The accuracy and efficiency of
the maximization process can be further enhanced by strategically applying a quadratic
curve fit through the likelihood. Under the simulated conditions, Zernike tilt modes 2 and
3 can be estimated independently. Once the tilt estimates are formed, curvature modes Z4
through Z10 can be independently estimated. The proposed sensors’ performance in high SNR
matches the performance of a tilt only sensor while operating with a larger D_p/r_0 ratio.
For systems currently using the tilt only sensor, this design offers a trade-off: either
maintain current performance while increasing the subaperture size, and hence reduce the
total number of subapertures and the complexity of the AO system, or enjoy improved
performance at the same subaperture size.
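
A minimal Python sketch of a quadratic curve fit through a one-dimensional likelihood is
given here. It assumes the likelihood has already been evaluated on a coarse grid of
candidate values for a single Zernike coefficient; the function name, the grid, and the
synthetic log-likelihood are hypothetical and are not the implementation used in this
research.

```python
import numpy as np

def refine_peak_quadratic(grid, values):
    """Refine the maximum of a sampled 1-D likelihood with a parabola.

    Fits a quadratic through the largest sample and its two neighbours and
    returns the vertex. Requires at least three grid points.
    """
    grid = np.asarray(grid, dtype=float)
    values = np.asarray(values, dtype=float)
    peak = int(np.argmax(values))
    i = int(np.clip(peak, 1, len(grid) - 2))      # keep a 3-point stencil in range
    a, b, _ = np.polyfit(grid[i - 1:i + 2], values[i - 1:i + 2], 2)
    if a >= 0:                                    # no local maximum: keep grid value
        return grid[peak]
    return -b / (2.0 * a)                         # vertex of the fitted parabola

# Synthetic log-likelihood that peaks at 0.37, sampled on a coarse grid.
grid = np.linspace(-1.0, 1.0, 9)
log_like = -(grid - 0.37) ** 2
estimate = refine_peak_quadratic(grid, log_like)
```

Because each coefficient is treated independently, the same routine could be applied once
per mode, which keeps the number of likelihood evaluations proportional to the number of
modes rather than growing with the size of a joint search grid.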

The description of each wavefront sensor also includes a list of key design variables and
operating variables. The development of these wavefront sensors included the derivation of
a projection based performance bound. The performance bound was shown to provide an
effective measure for selecting ideal design variable settings as a function of the operating
environment. As an aside to prototype sensor simulation and testing, this dissertation
developed the log polar sampling strategy for improving accuracy and isotropy in extensible
phase screen generation.

10.2  Future  Work 

This research shows that it is possible to improve wavefront sensing performance
over existing techniques in specific simulated conditions. The next step may be to prove
whether a hardware implementation of the sensor design can provide similar benefits.
Testing a hardware implementation of the projection based wavefront sensor with a narrowband
source would suffice to demonstrate the type of CCD read noise rejection and the relative
decrease in CCD read time that could be achieved over existing two-dimensional data
sensors. A true test of the projection based sensor may be limited by the availability of
specially designed CCD arrays. Although the technology exists to construct a fast, efficient
projection based array, the cost of producing CCDs specifically designed for projection
based wavefront sensing remains under investigation. In addition to hardware testing, the
simulation testing can be improved. Increased complexity in the atmospheric simulation
modeling might include temporal simulations using Taylor’s frozen flow and multilayered
atmospheric models.
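
As a hedged illustration of what a frozen flow temporal simulation involves, the Python
sketch below translates a single fixed phase screen across the aperture at a constant wind
speed. The function name, the integer-pixel shift, and the parameter values are assumptions
made for illustration; np.roll wraps the screen around its edges, which a real simulation
would avoid by using a longer or extensible screen.

```python
import numpy as np

def frozen_flow_frames(screen, wind_speed, dt, dx, n_frames):
    """Return n_frames copies of screen, each shifted along x by the wind.

    wind_speed [m/s], dt [s] between frames, dx [m] per pixel. The shift is
    rounded to a whole number of pixels for simplicity.
    """
    shift_per_frame = int(round(wind_speed * dt / dx))
    return [np.roll(screen, k * shift_per_frame, axis=1) for k in range(n_frames)]

# Hypothetical 4 x 5 screen, 10 m/s wind, 1 ms frames, 1 cm pixels: 1 pixel per frame.
screen = np.arange(20.0).reshape(4, 5)
frames = frozen_flow_frames(screen, wind_speed=10.0, dt=0.001, dx=0.01, n_frames=3)
```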

The sensor software algorithm can be modified independent of the CCD hardware
design. As such, projection based hardware research and testing can begin while
enhancements to the curvature sensor algorithm continue to be a subject of future research.
Perhaps the most limiting factor in the software algorithm is the need for a point source
or guide star reference. Section 4.2 discussed techniques for estimating the object along
with the unknown wavefront phase using phase diversity. The drawback was that phase diversity
loops are computationally intense. This makes phase diversity difficult to implement in real
time AO applications. The bandwidth problem suggests investigating the use of projections
in frequency domain MSE estimators for phase diversity applications. For instance, the
Gonsalves metric has been proven to work for two-dimensional image data, but its
performance using image projection data should be investigated. The Fourier analysis required
for phase diversity applications is computationally expensive. Image projections would
allow a one-dimensional transform versus a two-dimensional transform, significantly
reducing the amount of computation required to implement the Gonsalves metric. In addition
to reducing complexity via the one-dimensional transform, portions of the Fourier analysis
could be precomputed much like the phase screen transform implementation. If a strategic
set of OTF frequency points can be determined which provide better ability to estimate
lower order Zernikes, then the full phase diversity technique may be traded for a low order
Zernike estimator capable of operating in higher bandwidth real time applications.
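
The relationship between a projection and the two-dimensional transform can be checked
numerically. The Python sketch below uses the standard projection-slice property of the
discrete Fourier transform: the one-dimensional transform of a whole plane projection
equals the zero-frequency row of the two-dimensional transform of the image. The random
test image and variable names are hypothetical, and the sketch does not implement the
Gonsalves metric itself.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((16, 16))            # hypothetical 16 x 16 intensity image

# Project onto the x axis by summing over rows, then take a 1-D FFT.
projection = image.sum(axis=0)
spectrum_1d = np.fft.fft(projection)

# The same values appear as the zero-frequency row of the full 2-D FFT.
slice_2d = np.fft.fft2(image)[0, :]

max_difference = np.max(np.abs(spectrum_1d - slice_2d))
```

Only N values are transformed in the one-dimensional case instead of N x N, so a metric
restricted to this slice of frequency space needs far fewer operations per evaluation.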

The sensitivity analysis demonstrated an interrelationship between the sensor
performance and the quality of atmospheric parameter estimates. The use of this sensor design
in conjunction with atmospheric parameter estimation techniques should be investigated. As
an example, this sensor design might be used in a feedback loop configuration with existing
r_0 estimators to improve the overall AO system performance.

The research here considered performance for a single sensor subaperture. Phase
reconstruction using higher order modal estimates continues to be a subject of research.
Future work in this area includes the design and performance of a fast wavefront
reconstruction method using Zernikes 2 through 10.

In the area of phase screen generation, I suggest trying to reduce the percent error in
screen structure outside the inertial range. Large spatial correlation errors created by the
inherent periodicity of the Fourier transform might be reduced via implementation using
some other transform technique that allows for decorrelating long spatial distances within
the screen realizations. It may also be worthwhile to investigate the use of digital filtering
techniques and spectral estimation to further reduce error in phase screen generation via
a feedback loop. For instance, the spectrum could be estimated using screen realizations.
If the choice of spectral sample points were tied to the screen structure error, then
the choice of sample locations could be modified to reduce that error. This would provide a
measure with which to define the ideal set of frequency sample locations for a given number
of frequency sample points.
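
The periodicity error mentioned above can be seen in a conventional FFT based screen. The
Python sketch below filters white noise with the square root of a von Karman spectrum and
compares the mean squared phase difference at half the screen width with that at nearly
the full width. The spectrum constants, the uncalibrated amplitude scaling, and all
function names are assumptions made for illustration; this is not the log polar sampling
method developed in this dissertation.

```python
import numpy as np

def fft_phase_screen(n, dx, r0, L0, rng):
    """FFT screen sketch: complex white noise shaped by sqrt(PSD).

    The overall amplitude is not calibrated; only relative structure is used.
    """
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)          # rad/m
    kx, ky = np.meshgrid(k, k)
    psd = 0.023 * r0 ** (-5.0 / 3.0) * (kx**2 + ky**2 + (2.0 * np.pi / L0) ** 2) ** (-11.0 / 6.0)
    psd[0, 0] = 0.0                                    # remove the piston term
    noise = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return np.fft.ifft2(noise * np.sqrt(psd)).real

def column_msd(screen, sep):
    """Mean squared phase difference between columns sep pixels apart."""
    return np.mean((screen[:, sep:] - screen[:, :-sep]) ** 2)

rng = np.random.default_rng(1)
n = 64
screens = [fft_phase_screen(n, dx=0.01, r0=0.05, L0=10.0, rng=rng) for _ in range(20)]
msd_half = np.mean([column_msd(s, n // 2) for s in screens])   # separation N/2
msd_far = np.mean([column_msd(s, n - 1) for s in screens])     # separation N-1
```

For true turbulence the phase difference should keep growing with separation, but in the
periodic screen the opposite edges are effectively adjacent, so msd_far falls below
msd_half.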